DOCUMENT RESUME

ED 090 319                                TM 003 615

AUTHOR          Stufflebeam, Daniel
TITLE           Toward a Technology for Evaluating Evaluation.
PUB DATE        Apr 74
NOTE            103p.; Paper presented at the American Educational
                Research Association Annual Meeting (Chicago,
                Illinois, April 15-19, 1974)

EDRS PRICE      MF-$0.75 HC-$5.40 PLUS POSTAGE
DESCRIPTORS     Definitions; Educational Accountability; *Evaluation;
                *Evaluation Criteria; Evaluation Needs; *Evaluation
                Techniques; Guides; *Models; Technology
IDENTIFIERS     *Meta Evaluation

ABSTRACT
The aim of this paper is to present a logical
structure for the evaluation of evaluation (meta-evaluation) and to
suggest ways of conducting such evaluations. Part I contains an
analysis of background factors and problems associated with
meta-evaluation, that is, the evaluation of evaluation. This part
discusses the need for meta-evaluation and summarizes some of the
pertinent literature. Suggestions are made concerning what criteria
should guide the development of a meta-evaluation methodology. The
final and major portion of Part I is an enumeration of 6 classes of
problems that jeopardize methodology. The second part of the paper is
a conceptual response to the first part. Part II contains a
definition of meta-evaluation and a set of premises to undergird a
conceptualization of meta-evaluation. Most of Part II is devoted to
a logical structure for designing meta-evaluation studies. The third
part of the paper is an application of the logical structure
presented in Part II. Basically, Part III contains 5 meta-evaluation
designs. Four of the designs are for use in guiding evaluation work,
and the fifth is used in judging completed evaluation work. Taken
together, the three parts of the paper are intended to provide a
partial response to the needs for conceptual and practical
developments of meta-evaluation. (Author/WLP)



TOWARD A TECHNOLOGY FOR EVALUATING EVALUATION 



Daniel L. Stufflebeam 
Director of The Evaluation Center and 

Professor of Educational Leadership 
College of Education 
Western Michigan University 



Paper Presented at the 1974 National Convention 

of the 

American Educational Research Association
Chicago, Illinois 



April, 1974 



CONTENTS

Introduction

I. Background and Problems

    The Importance of Meta-Evaluation
    Available Meta-Evaluation Concepts and Approaches
    The Need for New Meta-Evaluation Concepts and Tools
    Meta-Evaluation Criteria
    Problems that Jeopardize Evaluation
        Conceptual Problems
        Sociopolitical Problems
        Contractual/Legal Problems
        Technical Problems
        Management Problems
        Moral, Ethical and Utility Considerations

II. A Conceptualization of Meta-Evaluation

    Meta-Evaluation Defined
    Premises
    A Logical Structure for Meta-Evaluation
    Purposes of Meta-Evaluation
    Steps in the Meta-Evaluation Process
    Objects of Meta-Evaluation
    Interactions of the Three Dimensions

III. Use of the Conceptualization of Meta-Evaluation

    Design #1--for Pro-active Assessment of Evaluation Goals
    Design #2--for Pro-active Assessment of Evaluation Designs
    Design #3--Pro-active Assessment of the Implementation of a
        Chosen Evaluation Design
    Design #4--for Pro-active Assessment of the Quality and Use
        of Evaluation Results
    Design #5--for Retroactive Assessment of Evaluation Goals

Summary

Figures 1-8

Footnotes






Introduction 

Good evaluation requires that evaluation efforts themselves be evaluated.
Many things can and often do go wrong in evaluation work. Accordingly, it is
necessary to check evaluations for problems such as bias, technical error,
administrative difficulties, and misuse. Such checks are needed both to improve
ongoing evaluation activities and to assess the merits of completed evaluation
efforts. The aim of this paper is to present a logical structure for the
evaluation of evaluation and to suggest ways of conducting such evaluations.

For ease of communication, Michael Scriven's label "Meta Evaluation" will
be used to refer to evaluations of evaluation. The term "primary evaluation"
will refer to the evaluations that are the subject of meta evaluations.

The main basis for this paper is work performed in the Ohio State University
Evaluation Center between 1963 and 1973. That work provided many occasions for
addressing meta-evaluation issues. The work included developing evaluation systems
and assessing the work of such systems. The Center staff also designed and
conducted several evaluation studies, trained graduate students and practitioners
to conduct evaluation work, and critiqued many evaluation designs and reports.
Those experiences presented many problems in evaluation work and therefore
opportunities for response. Those problems and responses are the basis for this
paper.

This paper has three major parts. Part I contains an analysis of background
factors and problems associated with meta-evaluation. This part discusses
the need for meta-evaluation and summarizes some of the pertinent literature.
Suggestions are made concerning what criteria should guide the development of a
meta-evaluation methodology. The final and major portion of Part I is an
enumeration of six classes of problems that jeopardize evaluation work, and
that therefore need to be addressed by a meta-evaluation methodology.

The second part of the paper is a conceptual response to the first part.
Part II contains a definition of meta-evaluation and a set of premises to
undergird a conceptualization of meta-evaluation. Most of Part II is devoted
to a logical structure for designing meta-evaluation studies.

The third part of the paper is an application of the logical structure
presented in Part II. Basically, Part III contains five meta-evaluation designs.
Four of the designs are for use in guiding evaluation work, and the fifth is
for use in judging completed evaluation work. Taken together, the three parts of
the paper are intended to provide a partial response to the needs for conceptual
and practical developments of meta-evaluation.






I. Background and Problems

The Importance of Meta-Evaluation

The topic of meta-evaluation is timely because evaluators increasingly
are being required to evaluate their work. During the past ten years there has
been a great increase in evaluation activity at all levels of education.
Thousands of federal projects have been evaluated, over half of the states
have started work on accountability systems, and several school districts
have instituted departments of evaluation. Such activity has cost millions
of dollars. It has been of variable quality, and there has been great
controversy over its worth. For example, Egon Guba wrote of the "failure of
Educational Evaluation." Overall, evaluators have come under much pressure
to insure and demonstrate they are doing quality work.

Available Meta-Evaluation Concepts and Approaches 

The literature of evaluation provides some guidance for evaluating
evaluation work. In the 1969 Educational Products Report, Michael Scriven
introduced the term meta evaluation and applied the underlying concept to the
assessment of a design for evaluating educational products. Leon Lessinger,
Malcolm Provus, Richard Seligman, and others have discussed the concept under
the label of educational auditing. The APA technical standards for test
development and the Buros Mental Measurements Yearbooks are useful meta
evaluation devices, since they assist in evaluating evaluation instruments.
Likewise the Campbell-Stanley piece on quasi-experimental design and true
experimental design is a useful tool for evaluating alternative experimental
designs. Campbell and Stanley, Bracht and Glass, the Phi Delta Kappa
Study Committee on Evaluation, Krathwohl, and Stufflebeam have prepared
statements of what criteria should be applied in meta-evaluation work.

As a part of an NIE effort to plan evaluation of R and D Centers and
regional laboratories, teams chaired by Michael Scriven and Daniel Stufflebeam
prepared alternative plans for evaluating lab and center evaluation systems.
Richard Turner has presented a plan for evaluating evaluation systems in
NIE's Experimental Schools Program.18 Thomas Cook has written an extensive
paper on secondary evaluation,19 and Michael Scriven recently has developed
a paper on how to assess bias in evaluation work. There is, then, an emergent
literature in meta-evaluation, and there are some devices for carrying out
meta-evaluation work.

The Need for New Meta-Evaluation Concepts and Tools 

However, the state of the art of meta-evaluation is limited in scope.
Discussions of the logical structure of meta-evaluation have been cryptic
and have appeared in only a few fugitive papers. These conceptualizations
lack reference to research on evaluation and they do not include extensive
analyses of problems actually encountered in practical evaluation work. The
writings on meta-evaluation have lacked detail concerning the mechanics of
meta-evaluation. While some devices, such as technical standards for tests,
exist, the available tools for conducting meta-evaluation work are neither
extensive nor well organized. Finally, there are virtually no publicized
designs for conducting meta-evaluation work. Overall, the state of the art
of meta-evaluation is primitive, and there is a need for both conceptual
and technical development of the area.



Meta-Evaluation Criteria 



In developing a methodology for meta-evaluation, it is important to have
in mind an appropriate set of criteria for judging evaluation designs and
reports. The existence of an adequate list of such criteria is crucial since
the criteria prescribe necessary and sufficient attributes of evaluation reports.

A good place to start in generating criteria for judging evaluation reports
is with accepted criteria for research reports. This is because criteria have
been explicated for evaluating research and because both research and evaluation
reports must contain sound information.

Criteria for judging research are suggested in the writings of Campbell
and Stanley;20 Gephart, Ingle and Remstad; and Bracht and
Glass. Basically, these authors have agreed that research must produce
findings that are internally and externally valid; i.e., the findings must be
true and they must be generalizable.

The criteria of truth and generalizability apply both to evaluation and
research. The findings of both types of studies must be unequivocal; this is
a concern for truth or internal validity. Also there is a concern in both
types of studies for external validity, which is the ability to extrapolate
the findings observed in one case to those that would be observed in other cases.
Whereas there is less of a need in evaluation than in research to generalize
beyond a particular organizational setting, there is nevertheless a concern
that evaluation findings be generalizable to some specified set of circumstances.
Hence, the findings of both research and evaluation must meet standards of
both internal and external validity.

However, internal and external validity are necessary but not sufficient
criteria for judging evaluation findings. In addition to producing good
information, evaluation must produce findings that are useful to some audience,
and the findings must be worth more to the audiences than the cost of obtaining
the information. In summary, evaluation, like research, must meet standards
of technical adequacy (internal and external validity); but evaluation findings
must also be useful and cost/effective.

These three standards have been spelled out by Guba and Stufflebeam
and the Phi Delta Kappa Study Committee in the form of eleven specific
criteria. These criteria are identified and explained below.

The specific criteria of technical adequacy are as follows:

1. Internal Validity. This criterion concerns the extent to which the
findings are true. Does the evaluation design answer the questions it is 
intended to answer? Are the results accurate and unequivocal? Clearly any 
study, whether research or evaluation, must at a minimum produce accurate 
answers to the questions under consideration. 

2. External Validity. This criterion refers to the generalizability
of the information. To what persons and program conditions can the
findings be applied? Does the information hold for only the sample from
which it was collected or for other groups? Is it time bound, or are the
findings predictive of what would occur in future applications? Basically,
meeting criteria of external validity means that one can safely generalize to
some population of interest, some set of program conditions, and some milieu
of environmental circumstances. Thus, in evaluation (as in research) it is
important to define the extrapolations one wants to make from the study results
and to demonstrate whether the findings warrant such extrapolations.

3. Reliability. This criterion concerns the accuracy of the data. How
internally consistent are the findings? How consistent would they be under
test-retest conditions? If the findings lack precision and reproducibility,
one should be concerned whether the findings are simply random and therefore
meaningless.

4. Objectivity. This criterion concerns the publicness of the data.
Would competent judges agree on the results? Or are the results highly
dependent on the unique experiences, perceptions, and biases of the evaluators?
It is possible that findings provided by a set of judges could be reproducible
and therefore reliable but heavily biased by the judges' predilections. Unless
the findings would be interpreted similarly by different but equally competent
experts, the true meaning of the results is subject to question.

The above scientific criteria are ones that should be met by all
evaluation and research studies. However, these criteria are insufficient to
judge the value of evaluative information. Whereas research must only produce 
results that are true, evaluation must also produce results that are useful. 
Thus criteria for judging evaluations must also consider whether the findings 
are informative to the users and whether the findings make a desirable impact on 
practice. Six criteria are relevant here. Each one involves an explicit or 
implicit interaction between the evaluative findings and some audience. 

The criteria for judging the utility of evaluation findings are: 

5. Relevance. This criterion concerns whether the findings respond to
the purposes of the evaluation. What are the audiences? What information
do they need? To what extent is the evaluation design responsive to the stated
purposes of the study and the information requests made by the audiences?
This concern for relevance of evaluation findings is crucial if the findings are
to have something more than academic appeal and if they are to be used by the
intended audiences.






Application of the criterion of relevance requires that the
evaluation audiences and purposes be specified. Such specifications
essentially result in the questions to be answered. Relevance is determined
by comparing each datum to be gathered with the questions to be answered.

6. Importance. This involves determining which particular data
should be gathered. In any evaluation study a wide range of data in response
to specific questions are potentially relevant to the purposes of the study.
However, practical considerations dictate that only a part of the potentially
relevant data can be gathered. Hence, the evaluator should choose those
data that will be most useful in serving the purposes of the study. To do
this, the evaluator needs to rate each potentially relevant datum on its
importance for meeting the purposes of the study, and he needs to know the
priorities the audiences assign to the various data. Then, based on his own
judgments and those of his audiences, the evaluator needs to choose those
data that are most significant for the purposes of the evaluation.
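
The rating-and-selection procedure described under Importance can be sketched in a few lines of Python. This is an illustration only, not a procedure given in the paper: the 1-5 rating scale, the equal weighting of evaluator and audience ratings, the cost figures, the greedy selection rule, and the names `Datum` and `select_data` are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Datum:
    """A potentially relevant item of data (hypothetical structure)."""
    name: str
    evaluator_rating: int  # evaluator's importance rating, assumed 1-5
    audience_rating: int   # audience's priority rating, assumed 1-5
    cost: float            # assumed cost of gathering the datum

    @property
    def importance(self) -> float:
        # Assumption: evaluator and audience judgments count equally.
        return (self.evaluator_rating + self.audience_rating) / 2


def select_data(candidates, budget):
    """Greedily keep the most important data that fit within the budget."""
    chosen = []
    remaining = budget
    for datum in sorted(candidates, key=lambda d: d.importance, reverse=True):
        if datum.cost <= remaining:
            chosen.append(datum)
            remaining -= datum.cost
    return chosen


# Hypothetical candidates; the ratings and costs are made up for illustration.
candidates = [
    Datum("reading scores", 5, 5, 40.0),
    Datum("teacher interviews", 3, 4, 50.0),
    Datum("attendance records", 2, 2, 10.0),
    Datum("classroom observation", 3, 3, 60.0),
]
selected = [d.name for d in select_data(candidates, budget=100.0)]
print(selected)
```

A greedy pass is the simplest way to respect the "practical considerations" constraint; an evaluator could just as reasonably weight the two sets of ratings unequally or negotiate the final list with the audience.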

7. Scope. A further condition of utility is that evaluative information
have adequate scope. Information that is relevant and important may yet fail
to provide all the information that the audience needs, for the evaluation
may fail to address all the important questions. The Michigan Assessment Program
is a case in point. This program's purpose is to assess the educational needs
of students in Michigan. In practice it provides data about the reading and
mathematics performance of 4th and 7th graders. While these data are relevant
and important for assessing certain educational needs of many students, the
data are very limited in scope. They pertain to students in only two grades.
The data don't provide information about interests, motivation, self-concept,
or emotional stability. Nor do they assess needs in science, art,
music, or a lot of other areas. This illustrates that evaluative data must
not only meet the purpose of the evaluation (relevance) and focus on the most
significant variables (importance); they must also respond to the full range
of study questions. Otherwise the scope of the results is limited.

8. Credibility. This criterion concerns whether the audience trusts
the evaluator and supposes him to be free of bias in his conduct of the
evaluation. Audiences often are not in a position to assess the technical
adequacy of a study. The next best thing they can do is decide whether they
have confidence in the group that conducted the study.

This factor is often correlated with the matter of independence.
In some cases the audience for a study wouldn't trust the results if they
were self assessments, but would accept perhaps identical results if they had
been obtained by some impartial, external evaluator. In other cases a self
assessment conducted by an internal team might be completely acceptable to the
audience.

It is crucial that the criterion of credibility be met by the study.
However technically adequate the findings may be, they will be useless if the
audience puts no stock in their credibility. For this important reason,
evaluators must exercise great care in insuring that the study will be
viewed as credible by members of the intended audience. The meta evaluator
needs to assess how much trust the audience places in the evaluation. Whether
an insider or an external agent, the evaluator can do much to insure credibility
for his study by carrying out his study openly and by consistently demonstrating
his professional integrity.

9. Timeliness. This is perhaps the most critical of the utility criteria.
This is because the best of information is useless if it is provided too late
to serve its purposes.


In research we are not concerned about timeliness, for the sole
aim is to produce new knowledge that is internally and externally valid.
It is thus appropriate that researchers take whatever time they need to
produce information that is scientifically adequate.

In evaluation, however, the purpose is not to produce new knowledge
but to influence practice. Therefore the practitioner must be given the
information he needs when he needs it.

In evaluation work this almost always creates a conflict. If the
evaluator optimizes the technical adequacy of the information he obtains,
he almost certainly will not have his report ready when it is needed. If
he meets the time constraints of his audiences, he probably will have to
sacrifice some of the report's technical adequacy. The position taken here is
that the evaluator should strive to provide reasonably good information to his
audience at the time they need it.

10. Pervasiveness. This final utility criterion concerns the dissemination
of the evaluation findings. Clearly the utility of an evaluation can be
partially gauged by determining whether all of the intended audiences receive
and use the findings from the evaluation. If an evaluation report that was
intended for use by teachers and administrators were provided to a chief
administrator who in turn did not distribute the findings to his teachers, we
would say the findings were not pervasive. This criterion is met when all
persons who have a need for the evaluation findings do in fact receive and
use them.

Overall, the four technical and six utility criteria listed above underscore
the difficulty of the evaluator's assignment. The evaluator's work must be
judged on grounds similar to those that are used to judge the scientific
adequacy of research reports. But the evaluator's report will also be judged for
its relevance, importance, scope, credibility, timeliness, and pervasiveness.
It is apparent that the evaluator cannot insist on optimizing any one criterion
if he is to optimize his overall effort. Rather he must make many compromises
in responding to these criteria and must strive to strike the best balance he
can in satisfying standards of technical adequacy and utility.

To make matters worse for the evaluator, there is yet a third standard
to be applied to the evaluator's work. This is the prudential concern of
efficiency.

11. Efficiency. Efficiency refers to the need to keep evaluation costs
as low as possible without sacrificing quality. Proper application of the
utility criteria of relevance, scope, importance, and timeliness should eliminate
the grossest of inefficiencies. However, there are always alternative ways of
gathering and reporting data, and these vary in their financial and time
requirements. Thus, care must be taken to choose the most efficient ways of
implementing the evaluation design. It is also important that evaluators
maintain cost and impact records of their evaluation activities. In this way
they will be able to address questions about the cost/effectiveness of their
work. In the long run evaluators must demonstrate that the results of their
efforts are worth more than they cost.

In summary, evaluations should be technically adequate, useful, and
efficient. The eleven criteria presented above are suggested to meta-evaluators
for their use in assessing evaluation designs and reports.
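
One way a meta-evaluator might keep the eleven criteria in view is as a simple checklist grouped under the three standards. The Python sketch below is an assumption-laden illustration: the paper stresses balancing the criteria rather than scoring them pass/fail, so the met/not-met judgments, the function name `unmet_criteria`, and the sample judgments are simplifications introduced here.

```python
# The eleven criteria, grouped under the paper's three standards.
CRITERIA = {
    "technical adequacy": [
        "internal validity", "external validity", "reliability", "objectivity",
    ],
    "utility": [
        "relevance", "importance", "scope",
        "credibility", "timeliness", "pervasiveness",
    ],
    "efficiency": ["efficiency"],
}


def unmet_criteria(judgments):
    """Return, per standard, the criteria not judged as met.

    `judgments` maps a criterion name to True (met) or False (not met);
    a criterion missing from the mapping is treated as not met.
    """
    gaps = {}
    for standard, names in CRITERIA.items():
        missing = [n for n in names if not judgments.get(n, False)]
        if missing:
            gaps[standard] = missing
    return gaps


# Hypothetical judgments about one evaluation report.
judgments = {name: True for names in CRITERIA.values() for name in names}
judgments["timeliness"] = False
judgments["external validity"] = False
gaps = unmet_criteria(judgments)
print(gaps)
```

In practice each entry would more plausibly hold a graded judgment with supporting notes, since the evaluator is expected to trade the criteria off against one another.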

Problems that Jeopardize Evaluation 

It is one thing to determine whether evaluation results meet the eleven
criteria described above. It is quite another thing to insure that these criteria
will be met. For the latter purpose, it is necessary to be able to predict the
problems that may jeopardize an evaluation study and to introduce appropriate
preventive measures.

In my past evaluation experience I have encountered a great many problems.
For some time I have thought it would be helpful if evaluators could have 
available a list of such problems. Such a list would help evaluators predict 
and counter problems before they happen. 

The following pages introduce and delineate six classes of problems that
I believe are commonly encountered in evaluation work. These classes are
conceptual, sociopolitical, contractual/legal, technical, administrative, and
moral/ethical. Basically, these problems are suggested for use in improving
ongoing evaluation work (the matter of formative meta-evaluation), but they
should also prove useful in assessing and diagnosing completed evaluation
studies (a concern of summative meta-evaluation). Each of the six problem
areas is defined and then explicated through the identification of specific
subproblems.

Conceptual Problems 

This problem area concerns how evaluators conceive evaluation. Evaluation
is typically a team activity. As a basis for effective communication and
collaboration among the team members it is necessary that they share a
common and well defined view of the nature of evaluation. Otherwise their
activities won't complement each other toward achieving some shared
objectives of the evaluation.

Also, alternative conceptualizations of evaluation might be adopted.
Depending on which one is chosen, the evaluators will produce evaluation
outcomes that differ both in kind and quality. For example, a conceptualization
that insists on the evaluation of goals will produce different results
from one that insists on a goal-free approach, and an approach that emphasizes
impressionistic analyses likely will yield less valid and reliable results
than an approach that requires close conformance to technical standards.
Since adherence to different conceptualizations of evaluation may lead to
results that are different in kind and quality, it is important that teams of
evaluators carefully consider and document the approach used to guide their
activities.

In addressing this problem area I believe evaluators should answer seven
general questions about evaluation: what is evaluation, what's it for, what
questions does it address, who should it serve, who should do it, how should
they do it, and by what standards should their work be judged. Each is
presented and described below.

1. What is evaluation? 

This first question concerns the definition of evaluation. What is 
it? One can respond to this question in a variety of ways. 

One way is to define evaluation as "determining whether objectives
have been achieved." This is the most common and classical way of defining
evaluation. It focuses attention on outcomes and suggests that stated
objectives be used to determine the worth of the outcomes. It doesn't call for the
assessment of objectives, project plans, and process, nor does it emphasize
ongoing feedback designed to help design and develop projects. "Relating outcomes
to objectives," then, is one way of responding to the question "What is
evaluation?" and this response has certain characteristics and limitations.

Another possible response is that evaluation is "the process of
providing information for decision making." This definition explicitly offers
ongoing evaluative feedback for planning and conducting programs. Also this
definition is broader than the previously described definition. This is
because providing information for decision making implies that the evaluation
would not only focus on outcomes but would also provide information for
choosing goals and designs and for carrying out the designs. However, this
definition is also limited; while it focuses on pro-active evaluation to
serve decision making it does not reference the need for retroactive evaluation
to serve accountability. So, the decision-making oriented definition, like the
outcomes oriented one, has both distinguishing characteristics and limitations.

A third possible response is the one found in standard dictionaries.
This one amounts to saying that evaluation is the ascertainment of merit. This
definition is broad enough to encompass all questions about value, quality,
or amount that one might imagine, and is not, therefore, as limited as the first
two definitions. Also, its generality admits the possibility of providing
information for both decision making and accountability. Of course, it is
communicable since it is consistent with common dictionary definitions. Its
weakness is in its lack of specificity.

These three definitions illustrate that there are alternative ways of
responding to the question, what is evaluation. Definitions that focus on
assessing outcomes, serving decision making, and assessing overall worth have
been mentioned. Other possibilities include equating evaluation to testing,
to professional judgment, and to experimental research.25 The way that a group
chooses to define evaluation has an important influence on what they produce;
hence, how a group defines evaluation is an important consideration in the
evaluation of their work.

2. What's it for?

The second question concerns the purposes of the evaluation. For
what purposes are the evaluation results to be used? Again one can respond
to this question in alternative ways, and the purposes that an evaluation
team chooses to serve can drastically affect what data they collect, how they
collect it, how they report it, and how others will evaluate it.

One possible purpose to be served by evaluation has already been implied
by one of the alternative definitions of evaluation. This purpose is to provide
information for decision making. Invoking this purpose requires that the
evaluators place great emphasis on the utility of the information that they
gather and report. In effect they must conduct their evaluation work
pro-actively so as to continually provide timely information for decision making.
This is much like Scriven's notion of formative evaluation.

Another purpose of evaluation that we hear a lot about these days is
accountability. This means maintaining a file of data that decision makers
can use to be accountable for--to defend--their past actions. Serving this
purpose calls for a retrospective approach to evaluation which is much the
same as Scriven's concept of summative evaluation.

Still a third purpose involves developing new knowledge that is internally
and externally valid. Early in this paper I defined this type of activity as
research and not evaluation. However, many persons equate the two concepts.
When they do, one can foresee possible troubles related to the outcomes. If
the inquirer suboptimizes the criteria of technical adequacy, his findings
will probably lack utility. But if he claims to be doing research and doesn't
insist on meeting the criteria of technical adequacy to the exclusion of
utility criteria, the outcomes may likely be judged as bad on research grounds,
whether or not the findings are useful.

These examples illustrate that evaluations may serve different purposes.
Further, the purposes suggest different criteria, or at least different emphases,
for judging the results of evaluative efforts. Also the evaluators can get
into trouble if they set out to serve a different set of purposes than those
that their sponsors and audiences have in mind. Thus, evaluators should be
explicit about the purposes they are serving and meta-evaluators should assess
the clarity, consensus, and implications of those purposes.

3. What questions do evaluations address? 

This third question concerns the foci of the evaluation. What 
questions might be addressed? Which ones will be addressed? 

Classically, evaluations have addressed questions about outcomes.
This is certainly one important focus for evaluation work, but it is only one.
Evaluations may also assess the merit of goals, of designs for achieving the
goals, and of efforts to implement the designs.

Many specific questions might be addressed in relation to goals, designs,
implementation, and results. These vary according to the substance of what
is being evaluated. Also they vary according to the purpose(s) being served.
For example, if the purpose is to serve decision making and if the focus is
implementation, the evaluator might concentrate on identifying potential barriers;
but if the purpose is to serve accountability in relation to the implementation
of a design, the evaluator would need to document the total implementation
process.

One way of identifying and analyzing potential evaluation questions
is to develop and fill out an appropriate matrix. Its vertical dimension
should include the purposes of the evaluation, i.e., decision making and
accountability. Its horizontal dimension should include the four categories
of goals, designs, implementation, and results. Figure 1 illustrates the use
of such a matrix; its cells have been filled out to illustrate the evaluative
questions that might be addressed in an evaluation study.

[Figure 1: matrix of evaluation purposes (decision making, accountability) by categories of questions (goals, designs, implementation, results); cell contents are illegible in the source.]

This matrix illustrates that up to eight categories of questions 
might be addressed in any evaluative effort. The matrix shows that many
specific questions might be addressed in each specific category. The meta- 
evaluator can assess what questions are being addressed, whether they are 
the right ones, and whether they are all the questions that should be 
addressed. 
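The matrix described above can be sketched as a small data structure. This is only an illustrative sketch: the function names are hypothetical, and the two sample questions are paraphrased from the implementation example given earlier in this section, not taken from Figure 1.

```python
from itertools import product

# The two purposes and four categories named in the text.
PURPOSES = ("decision making", "accountability")
FOCI = ("goals", "designs", "implementation", "results")


def empty_matrix():
    """Build the 2 x 4 matrix; each cell holds a list of questions."""
    return {cell: [] for cell in product(PURPOSES, FOCI)}


def add_question(matrix, purpose, focus, question):
    """Record one evaluative question in the cell (purpose, focus)."""
    matrix[(purpose, focus)].append(question)


def unaddressed_cells(matrix):
    """List cells with no questions, for a meta-evaluator to review."""
    return [cell for cell, questions in matrix.items() if not questions]


matrix = empty_matrix()
add_question(matrix, "decision making", "implementation",
             "What potential barriers to implementation exist?")
add_question(matrix, "accountability", "implementation",
             "What was the total implementation process?")

for cell in unaddressed_cells(matrix):
    print("No questions yet for:", cell)
```

Listing the empty cells is one way a meta-evaluator might ask whether the questions addressed are all the questions that should be addressed; an empty cell is a prompt for discussion, not necessarily a defect.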

4. Who will the evaluation serve?

The fourth question concerns the audience for the evaluation. Who
will be served? What do they need? How will they be served? These are key 
issues regarding both the questions to be addressed and the means of reporting 
back to the audiences. 

Invariably there are multiple audiences that might be served. For 
example, teachers, researchers, administrators, parents, students, sponsors,
politicians, publishers, and taxpayers are potentially interested in the
results of evaluations of educational innovations. However, these audiences
are interested in different questions, and require different amounts and
kinds of information. Hence, evaluation designs need to reflect the different
audiences, their different information requirements, and the different reports 
that are required to service them. 

If these matters are left to chance, as is often the case, the 
evaluation may be expected to fail to meet the needs of some of the audiences. 
This is because the reports from an evaluation designed to serve one 
audience likely will not meet the needs of other audiences. 

This was dramatically illustrated in the U. S. Office of Education
evaluation of the first year of the Title I Program of the Elementary and
Secondary Education Act.27 This multi-billion dollar program, designed to
upgrade educational opportunities for disadvantaged children, was of
interest to many different audiences.

Two such audiences were local educators and Congressmen. Their
interests were quite different, however. Local level educators were especially
concerned about how to make the individual projects succeed. The Congressmen
wanted to know what the total program was accomplishing. Clearly, no single
evaluation study or report could serve the different needs of these two
audiences.

The U.S. Office, being responsible to the Congress for evaluating the
Title I Program, had to decide on the audiences, questions, design, and
reports for a national evaluation of Title I. USOE officials did not
distinguish between different audiences to be served, nor did they plan how
different information requirements at national and local levels would be met.
USOE officials allowed each school district to design its evaluation exclusively
to serve local information requirements. Due to potential political problems
with the schools, no requirements were placed on the schools to use common
instruments by which information could be gathered on a uniform basis for
submission to the Congress.

Incredible as it may seem, USOE officials assumed that the thousands 
of local school Title I evaluation reports could be aggregated into a single 
report that would respond to interests of the Congress. USOE did develop a 
report that attempted to integrate and aggregate the local school reports, but 
the result was a disaster and an embarrassment to all concerned.

This illustrates that it is important early in the process of 
designing an evaluation to carefully identify and analyze the information 
needs of the different audiences for the evaluation. This type of audience 
analysis must be used in designing data collecting and reporting activities.
Careful attention to this area can assist greatly in satisfying criteria
of relevance, importance, scope, and timeliness for the evaluation. The
meta-evaluator, then, will do well to assess evaluation designs for their
attention to the audiences to be served. 
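The audience analysis discussed above can be sketched as a simple coverage check. The function name, the record layout, and the sample entries are hypothetical; the two audiences and their questions are paraphrased from the Title I example.

```python
def unserved_audiences(needs, reports):
    """Return, per audience, the questions that no planned report answers.

    needs   -- dict mapping an audience to the questions it wants answered
    reports -- list of dicts with the keys "audiences" and "answers"
    """
    gaps = {}
    for audience, questions in needs.items():
        covered = set()
        for report in reports:
            if audience in report["audiences"]:
                covered |= set(report["answers"])
        missing = sorted(set(questions) - covered)
        if missing:
            gaps[audience] = missing
    return gaps


# Hypothetical plan resembling the situation described in the text:
# only locally oriented reports were planned.
needs = {
    "local educators": ["how to make individual projects succeed"],
    "Congress": ["what the total program is accomplishing"],
}
reports = [
    {"title": "district report",
     "audiences": ["local educators"],
     "answers": ["how to make individual projects succeed"]},
]

gaps = unserved_audiences(needs, reports)
print(gaps)
```

A check of this kind only shows that a report is planned for each audience and question; whether locally designed reports can later be aggregated on a uniform basis is a separate design question that it does not address.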

5. Who should do the evaluation? 

This fifth question concerns the agent for the evaluation. Should
educators do their own evaluations? Should they employ evaluation specialists
and have them do it? Should they subcontract to some external evaluation
company? Should the educators do their own evaluation but engage an external 
auditor to check their work? Or what? 

These questions are complicated but they become even more complicated 
when the dimension of purpose of the evaluation is added. Who should do 
evaluation for decision making? Who should do evaluation that is intended to
serve accountability? 

Answers to these questions are important, because there are different
costs, benefits, and problems associated with the use of different evaluation
agents. It may be cheapest to do one's own evaluation, but to do so invariably
sacrifices the important criteria of objectivity and credibility. Conversely,
the employment of external evaluation agents enhances objectivity and
credibility, but it can increase disruption, costs, and threat. The
meta-evaluator should check on how the question of evaluation agents has been
handled, or might be handled, and he should assess alternative consequences 
of the different possible arrangements. 

6. How should the evaluation be conducted? 

The sixth question regarding the conceptualization of evaluation 
concerns the methodology of evaluation. What is the process of evaluation?
What steps have to be negotiated in the course of doing an evaluation?
To what extent have sound procedures been worked out for implementing each 
step in the evaluation process? 

There are alternative conceptualizations of evaluation and each
one has its different steps. While most authors do not view the evaluation
process as linear, they have recommended varying lists of steps that
evaluators should carry out. Stake has suggested an approach that
involves describing a program, reporting the description to the audience, 
obtaining and analyzing their judgments, and reporting the analyzed judgments 
back to the audience. 

Michael Scriven has suggested nine steps in his Pathway Comparison 
Model. They are (1) characterizing the program to be evaluated, (2) clarifying
the conclusions wanted, (3) checking for cause and effect relationships, 
(4) making a comprehensive check for consequences, (5) assessing costs, (6) 
identifying and assessing program goals, (7) comparing the program to 
critical competitors, (8) performing a needs assessment as a basis for 
judging the importance of the program, and (9) formulating an overall 
judgment of the program. 

Newton S. Metfessel and William B. Michael, in writing about
Ralph Tyler's rationale for evaluation, have suggested an eight step
evaluation process. Their steps are (1) involvement of all interested
groups, (2) development of broad goals, (3) construction of behavioral
objectives, (4) development of instruments, (5) collection of data, (6)
analysis of data, (7) interpretation of the meaning of the findings, and
(8) formulation of recommendations.

Malcolm Provus has proposed a five step process. It is (1) clarifying
the design of a program, (2) assessing its implementation, (3) assessing
its interim results, (4) assessing its long term results, and (5) assessing
its costs and benefits.

As a final example the Phi Delta Kappa Study Committee on 
Evaluation presented a three step process. It included (1) interacting
with the audiences to delineate information requirements; (2) collecting,
organizing, and analyzing the needed information; and (3) interpreting and
reporting the findings back to the audience.

These different conceptions of the evaluation process illustrate that
evaluators will do different things depending on which conceptualization of
evaluation they use. If Scriven's approach is followed, great attention will
be given to steps that insure the technical adequacy and inclusion of judgments
in the evaluation; but little concern will be given to interactions (with
audiences) that are designed to insure the utility of the evaluation reports.
Conversely, the other approaches place heavy emphasis on interactions with
audiences to insure that the obtained information will be used by the intended
audiences.

The meta-evaluator should identify what evaluation process is being 
followed; examine the implications of the selected process in relation to 
criteria of technical adequacy, utility, and efficiency; and check on the 
provisions for carrying out the evaluation process. Feedback of such
information to evaluators should help them decide whether their design is in 
need of modification or explication. 

7. By what standards should the evaluation be judged?

The position has already been advanced in this paper that evaluations
should meet standards of technical adequacy, utility, and cost/effectiveness.
These standards were further defined in the form of the eleven criteria of
external validity, internal validity, reliability, objectivity, relevance,
scope, importance, credibility, timeliness, pervasiveness, and efficiency.
In accordance with this position meta-evaluators should assess the extent
to which evaluations have been designed to meet these criteria.

There are a number of considerations in making such checks. What 
priorities do the evaluators assign to each of the eleven criteria? What
priorities do the audiences apply to the different criteria? Are the
priorities for different criteria likely to be in conflict? To what
extent is the overall conceptualization of evaluation consistent with the 
standards of adequacy for the evaluation that evaluators and their audiences 
have in mind? 
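One way to make the check for conflicting priorities concrete is sketched below. It assumes, purely for illustration, that evaluators and audiences have each ranked the eleven criteria from 1 (highest priority) to 11; the ranking scheme, the function name, and the threshold of three rank positions are assumptions and not part of the paper.

```python
# The eleven criteria named in the text.
CRITERIA = (
    "external validity", "internal validity", "reliability", "objectivity",
    "relevance", "scope", "importance", "credibility", "timeliness",
    "pervasiveness", "efficiency",
)


def priority_conflicts(evaluator_ranks, audience_ranks, threshold=3):
    """Return (criterion, gap) pairs whose ranks differ by >= threshold.

    Both arguments map each criterion to a rank, 1 being the highest
    priority. The largest gaps come first.
    """
    conflicts = []
    for criterion in CRITERIA:
        gap = abs(evaluator_ranks[criterion] - audience_ranks[criterion])
        if gap >= threshold:
            conflicts.append((criterion, gap))
    return sorted(conflicts, key=lambda pair: -pair[1])


# Hypothetical rankings: the audience values timeliness far more, and
# internal validity far less, than the evaluators do.
evaluator_ranks = {c: i + 1 for i, c in enumerate(CRITERIA)}
audience_ranks = dict(evaluator_ranks)
audience_ranks["timeliness"] = evaluator_ranks["internal validity"]
audience_ranks["internal validity"] = evaluator_ranks["timeliness"]

conflicts = priority_conflicts(evaluator_ranks, audience_ranks)
print(conflicts)
```

Rank gaps are a crude signal at best. They may help a meta-evaluator decide which criteria to raise with evaluators and audiences, but they do not settle which priority should prevail.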

This concludes the discussion of conceptual problems in evaluation. In
summary, evaluators and their audiences need to hold in common some defensible
conceptualization of evaluation that can guide their collection and use of
evaluation data. There are alternative conceptualizations that might be
adopted. Meta-evaluators are encouraged to check the clarity, common acceptance, and
adequacy of a particular conceptualization by posing seven questions to the
evaluators and their audiences. The questions are:

1. What is evaluation?

2. What purposes does it serve?

3. What questions does it address?

4. Who should it serve?

5. Who should do it?

6. How should they do it?

7. By what standards should their work be judged?



This section included alternative responses that might be given to
each of these questions. An attempt was made to indicate the implications 
of the different answers for different evaluation outcomes that might be 
achieved. 

Given this analysis of conceptual problems, we next turn to the second
category of problems in evaluation work. These are the sociopolitical
problems.

Sociopolitical problems 

This problem area reflects that evaluations are carried out in social 
and political milieus. By virtue of this, the evaluator must face many 
problems in dealing with groups and organizations. 

Unless the evaluation design includes provision for dealing effectively 
with the people who will be involved in and affected by the evaluation, these
people may well cause the evaluation to be subverted or even terminated. As
any evaluator can testify, sociopolitical problems and threats are real; and
evaluation training programs and textbooks do not prepare evaluators to deal
with these problems. In evaluation it is of utmost importance to check for
the existence of potential sociopolitical problems and to plan how they can 
be overcome. 

My list includes seven sociopolitical problems. They are the problems
of involvement, internal communication, internal credibility, external
credibility, security, protocol, and public relations. Each of these problems is
described in more detail below. 



1. Involvement 

This first sociopolitical problem concerns the involvement of the 
persons on whose cooperation the success of the evaluation depends. A
principle of educational change is that unless persons who will need to
support the change are involved early, meaningfully, and continuously in
the development of an innovation, they likely will not support the operation 
and use of the innovation. 

This principle applies in evaluation as well. Bettinghaus
and Miller have pointed this out in their analysis of resistance throughout
Michigan to the newly developed state accountability system. Their
explanation is that much of the resistance would not exist if more people
throughout Michigan had been involved earlier in the design of the Michigan
Accountability System. Evaluation and accountability at best are threatening
concepts. If persons whose work is to be evaluated are not involved in
discussions of criteria by which their work will be judged, methods by
which data will be supplied, and audiences who will receive the reports, these
persons can hardly be expected to be supportive of the evaluation. More likely
they will resist, boycott, or even attempt to subvert the evaluative effort.

What can the evaluator do to involve persons whose support is required 
if the evaluation is to succeed? One thing he could do is to design the
evaluation work to the last detail, then present the design at a meeting
comprised of persons representing all interested parties. While he could do
this, and while evaluators often do it this way, this is just about the worst
thing they can do. 

Presenting a "canned" design to previously uninitiated but interested
persons at a large meeting is pregnant with involvement problems. The
attendees likely will include administrators, sponsors, evaluators, and
teachers. Certainly some of the persons will be reluctant participants,
and none of them, outside the designer of the evaluation, will have any
commitment to the prepared design. Any one person who wants to delay or
cancel the evaluation task will find it easy at such a meeting to rally
support for his questions and reservations. The evaluator, on the other
hand, may find no one in his "corner". So the first checkpoint in regard
to the involvement problem is that evaluators not plan to orient
participants in the evaluation through presenting them with a finished evaluation
design at a large group meeting.

Instead, evaluators must involve groups in the development of the design
before it is ever presented in anything like final form. Small advisory
panels can be established and convened for the purpose of hearing their
recommendations. Small groups can be engaged in working sessions to provide
recommendations regarding such matters as logistics. Much individual contact
with interested persons should be arranged, both face-to-face and via
telephone and mail, especially to get their views of what questions should be
addressed by the evaluation. Unless there is some compelling reason for it,
the evaluator should probably avoid holding a large group meeting to review
the evaluation design; it is preferable to hold several small group meetings.

The point here is that many interested persons should be involved
in developing evaluation designs to win their cooperation. The
meta-evaluator should examine the evaluation for evidence that persons whose
support is needed are provided opportunities for real input into the
evaluation planning. The meta-evaluator should also check for the existence
of unnecessary situations in which adversaries of the evaluation might be
given opportunities to cause the evaluation to fail. Accordingly, evaluation
designs and activities should be checked for their provisions against problems
that may result from a lack of involvement of interested persons in the
evaluation or from bad plans for involving people.

2. Internal Communication

The second sociopolitical problem is that of internal communication.
Evaluations involve many activities that are not routine for persons in the
system where the evaluation is being conducted. At best these activities are
disruptive, but they can become intolerable to system personnel if they occur
as a succession of surprises. Moreover, if system personnel do not
understand their roles in the evaluation, they can't perform them. If they don't
perform them, the evaluation can hardly be successful. Also audiences for the
evaluation can't use evaluative data if they do not know it exists. The point
of this discussion is that evaluation activities should be supported by some 
system of ongoing internal communication. 

The internal communication should focus particularly on data collection
and reporting. Periodically all persons who are involved in data collection
should be informed about what groups will be involved, in what ways, in
providing what data, at what times. Figure 2 presents one way of communicating such
information to interested persons. This figure is a chart that shows who is
scheduled to respond to designated data collection instruments for each day
of some explicit period. Likewise Figure 3 is a similar chart that shows what
audiences are scheduled to receive what reports on what days. The preparation 
of such charts can be used to inform system participants about their future 
involvement in the evaluation. Of equal importance, the projection of such 
calendars can aid evaluators to identify conflicts and feasibility problems 
in their data collection and reporting schedules. 



[Figure 2: chart showing which groups are scheduled to respond to which data collection instruments on each day of the period; contents illegible in the source.]

[Figure 3: chart showing which audiences are scheduled to receive which reports on each day of the period; contents illegible in the source.]

These charts are suggested as devices that meta-evaluators can use to
check for communication problems. The completed charts can be used to check
whether the system personnel and evaluators have common understandings of
their evaluation responsibilities. The charts can also be used to help the 
evaluators discover feasibility problems in their plans. 
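The kind of feasibility check that such a chart supports can be sketched as follows. The schedule format, the function name, and the limit of one instrument per group per day are assumptions made for illustration; the sample entries are hypothetical and are not taken from Figure 2.

```python
from collections import defaultdict


def overloaded_days(schedule, max_per_day=1):
    """Find (day, group) pairs scheduled for too many instruments.

    schedule    -- iterable of (day, group, instrument) entries
    max_per_day -- assumed limit on instruments per group per day
    """
    load = defaultdict(list)
    for day, group, instrument in schedule:
        load[(day, group)].append(instrument)
    return {key: items for key, items in load.items()
            if len(items) > max_per_day}


# Hypothetical data collection schedule.
schedule = [
    ("Mon", "teachers", "questionnaire A"),
    ("Mon", "teachers", "interview B"),
    ("Tue", "principals", "interview B"),
]

conflicts = overloaded_days(schedule)
for (day, group), instruments in conflicts.items():
    print(f"{group} on {day}: {', '.join(instruments)}")
```

The same structure could hold a reporting calendar by replacing groups with audiences and instruments with reports. What counts as an overload is a judgment that system personnel are better placed to make than the evaluator.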

To insure that internal communication is systematically maintained,
evaluators can use a number of techniques. They can report at staff meetings.
They can issue weekly evaluation project newsletters, and they can maintain
advisory boards that represent the system personnel.

It is important that evaluators use appropriate means to maintain
good communication with system personnel. This is necessary both to insure
their cooperation, which is necessary for the technical adequacy of evaluation
efforts, and to insure that the evaluation findings will be used.
Meta-evaluators can provide valuable service to evaluators through checking their
evaluation plans for the adequacy of provisions for internal communication.

3. Internal Credibility

A third sociopolitical problem concerns the internal credibility
of the evaluation. Particularly this involves the extent that system personnel 
trust the evaluator to be objective and fair in his collection and reporting 
of data. 

A common characteristic of evaluations is that evaluators must often 
collect data from persons at one level of a system and then collate the data
and report them to persons at the next higher level of the system. For 
example, it is common that data must be collected from teachers for development 
of a school-level report to serve the school principal. This characteristic 
of evaluation causes a natural threat: persons who respond to evaluative 
queries wonder whether they are being evaluated and whether the data they
provide will be used against them. It is little wonder that evaluators'
requests that educators respond to interviews and questionnaires often are
met with anxiety. 

To secure the cooperation of potential respondents to evaluation 
instruments, evaluators must clarify how the collected data will be reported 
and used; and the evaluators must establish a climate of mutual trust and 
cooperation. Particularly evaluators must clarify who will receive the 
evaluation reports and whether the reports will be used to evaluate the 
persons who supplied the basic data. The evaluators must say whether or not 
they can guarantee anonymity; if they make such guarantees, they must demonstrate
how they can live up to their commitments. Above all, the evaluators must
constantly demonstrate the highest level of professional integrity. 

If there are problems of internal credibility, the technical
adequacy and utility of the evaluation will be threatened. Again a crucial
question to be addressed in a meta-evaluation is whether there are any problems
of internal credibility. This can be checked by posing questions to the
evaluator that he might be asked by potential suppliers of data for the study.
Cover letters for questionnaires can also be reviewed, and potential evaluation
respondents can be interviewed. Feedback to the evaluator should identify
areas that need clarification, contradictions in various communications to
data suppliers, concerns of the data suppliers, and suggestions for strengthening
the internal credibility of the study.

4. External Credibility

The fourth sociopolitical problem is external credibility. This 
involves whether persons outside the system being evaluated have confidence 
in the objectivity of the evaluators. 



To the extent that evaluators have done a good job with internal 
credibility, they are likely to encounter external credibility problems. 
If people inside the system are comfortable with and confident in the 
evaluator, people outside the system may think the evaluator has been
co-opted. This is because outsiders commonly expect the evaluator to do
an independent, objective, hard-hitting assessment of merit and they take
it for granted that insiders will resist and be anything but confident in 
the evaluator. 

The internal credibility/external credibility dilemma is a
common and difficult one for evaluators. However, the technical adequacy
and utility of the evaluation depend on the evaluator being credible to
both insiders and outsiders. The evaluator must, therefore, be alert to
problems in both areas and he must strive to overcome them.

5. Security of data 

One of the ways to enhance the internal credibility of the evaluation
is through attending to the fifth sociopolitical problem. This is
the problem of security of data which, of course, concerns whether the obtained 
data are under the complete control of the evaluator. 

To be kept, guarantees of anonymity must be backed by strong 
security measures. Some of these are common sense, such as storing data in 
locked files and strictly controlling the keys to the files. Another effective 
method is to insist that respondents not place their names on the forms they
fill out. Also, matrix sampling can be used, as in the case of "National
Assessment," to prevent any person, school, or school district from being
identified by a particular score. Of course there are limits to the guarantees
of security that can be upheld, as has become evident in the infamous Watergate
case. The evaluator should provide reasonable assurances, and he should
make provisions for upholding these, but he must not make promises that
cannot be met.

Problems of security can influence the evaluator's ability to collect data
and thus affect the technical adequacy of the results. In the long run, if
security of data is not maintained, the evaluator will likely encounter
great resistance in his attempts to collect data. Again, I urge that
meta-evaluators make reasonable checks to uncover any problems relating to
security of data, for such problems can incapacitate an evaluation effort.
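The matrix sampling idea mentioned above can be illustrated with a minimal sketch. It assumes the simplest form of the technique, in which each respondent answers only a random subset of the items, so that no respondent has a complete score and results are read only at the item or group level. The function name and parameters are hypothetical, and actual assessment programs use more elaborate designs.

```python
import random


def assign_items(respondent_ids, item_ids, items_per_person, seed=0):
    """Give each respondent a random subset of the items.

    No respondent sees every item, so no complete individual score
    exists to be reported or leaked.
    """
    rng = random.Random(seed)  # fixed seed makes the plan reproducible
    pool = list(item_ids)
    return {
        respondent: sorted(rng.sample(pool, items_per_person))
        for respondent in respondent_ids
    }


def items_covered(assignment):
    """Return the set of items answered by at least one respondent."""
    covered = set()
    for items in assignment.values():
        covered.update(items)
    return covered


respondents = ["r1", "r2", "r3", "r4", "r5", "r6"]
items = list(range(1, 7))  # six hypothetical test items
assignment = assign_items(respondents, items, items_per_person=2)
print(assignment)
print("Items never administered:", set(items) - items_covered(assignment))
```

Random assignment does not guarantee that every item is administered, which is why the sketch reports uncovered items. A design used in practice would balance the assignment, and it would still need the physical safeguards described above.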

6. Protocol

The sixth problem concerns protocol. One commonly hears that 
school districts and schools maintain standard protocol procedures that 
outsiders are expected to use. Problems in this area may develop when
evaluators don't find out and use the system's protocol procedures. 

Essentially protocol involves interactions with the chain of
command. In some schools outsiders must always get clearance from a teacher's
or principal's immediate superior before visiting or communicating with the
teacher or principal about school affairs. Also it is common for school
superintendents to clear contacts for outsiders with school board members.
In some evaluations evaluators are asked not to contact school personnel
until a school official has formally announced and sanctioned the
evaluation plans. In extreme cases administrators have been known to
require that evaluators be accompanied in their visits to school personnel
by the administrator or his representative. Clearly, there are many
alternative protocol arrangements that evaluators may be expected to
honor.

Such requirements present a dilemma to the evaluator. While it
is inexcusable for evaluators not to find out what protocol expectations exist,
it is not at all clear on a priori grounds how they should respond to them.
If the evaluator doesn't first clear his questionnaires with the school
principal, the teachers may not respond to the questionnaires. If the
evaluator goes along with an administrator's request that questionnaires be
administered and collected by the administrator, this may bias how teachers
respond to the instruments. If the evaluator allows the administrator to
sit in on interviews, a serious question will exist concerning the validity
of the interview results. Thus the evaluator must deal carefully with the
deceptively simple-appearing problems of protocol.

7. Public relations 

The seventh and final sociopolitical problem is that of public 
relations. This problem concerns the public's and the news media's 
interest in evaluation work and how evaluators should treat such interests.

Evaluations are often of interest to many groups, sometimes for
the evaluations' informative aspects and other times for their sensational
qualities. Thus, reporters frequently seek to learn about the nature and
findings of evaluation studies. Newspaper articles, press conferences,
television releases, etc., are common occurrences in evaluation work. As a
consequence evaluators, whether they like it or not, must deal in public 
relations. 

This situation, like so many others, presents the evaluator both
with opportunities and problems. Cooperation with the news media is a
desirable means of keeping the public informed about the evaluation activities
and results. However, reporters are not always respectful of the evaluator's
concern for controlling what information is publicly disseminated; hence,
if they can get it, reporters may publicly report information that the
evaluator had agreed to report privately to some restricted audience. Also
reporters may edit or slant an evaluator's report. If the utility of their
findings is not to be jeopardized, evaluators must work very carefully with
representatives of the news media.

The posture of this paper is that evaluators should take the 
initiative in the public relations area. They should make contact with
reporters. They should project a schedule of news releases, and they should 
reach agreements about what information is out-of-bounds for public release. 
Protocol should be established for the release and editing of the evaluative 
information. 

The main problem to be avoided in the public relations area is
that of neglecting it. Meta-evaluators should probe to find out what arrangements
have been made in the public relations area, and they should critique these 
arrangements for their appropriateness. 

With the public relations problem, the discussion of sociopolitical
problems has been completed. The seven problems in this area were 
involvement, internal communication, internal credibility, external credibility, 
security, protocol, and public relations. These problems remind one that 
evaluations occur in sociopolitical contexts and that evaluators must be 
mindful of this if their work is to be technically adequate and useful. Meta- 
evaluators can help by checking for the existence of sociopolitical problems 
and developing appropriate recommendations. 




Next, we turn to the third category of problems in evaluation work.
These have been labeled contractual and legal problems. 

Contractual/Legal Problems

This third problem area reflects the fact that evaluations need to be 
covered by working agreements among a number of parties both to insure 
efficient collaboration and to protect the rights of each party. Successful 
evaluation requires that evaluators, sponsors, and program personnel collaborate. 
If this collaboration is to be effective, it needs to be guided by working
agreements. If these are to hold, they often need to be in the form of some
legal instrument such as letters of agreement and contracts. Such legal 
instruments should be reflective of possible disputes that might emerge 
during the evaluation and of the assurances that each party requires in 
regard to these possible disputes. 

One way of conceptualizing contractual and legal problems in evaluation
is to project items that might be standard in most contracts between an
external evaluation agent and some system or sponsor. I have in mind
eight such contractual items. They are (1) definition of the client/evaluator
roles, (2) specification of evaluation products, (3) projection
of a delivery schedule for evaluation products, (4) authority for editing
evaluation reports, (5) definition of the limits of access to data that must
be observed, (6) the release of evaluation reports, (7) differentiation of
responsibility and authority for conducting evaluation activities, and
(8) the source and schedule of payments for the evaluation. Satisfactory
performance in these task areas is essential if the evaluation is to be
conducted efficiently and if it is to succeed in meeting standards of technical
adequacy and utility. Each of these contractual and legal problems is defined
in more detail below.

1. Definition of the Client and Evaluator Roles 

Clarifying the roles involved in evaluation work and identifying the
agencies and persons who are responsible for those roles is the first contractual/
legal problem. Problems of role clarification are common in programs, whether
they occur within agencies or involve relationships among several agencies.
If the evaluation is to be conducted smoothly and if it is to serve its audiences
well, the roles required for conducting and using the evaluation must be defined
and the agencies and persons who will be responsible for these roles must
understand and accept their roles. Hence, the legal agreements that govern
evaluation work must clearly define the roles to be implemented.

Basically, evaluation functions can be grouped according to the
main roles of sponsor, evaluee, and evaluator. These roles may be implemented
independently by separate agents, or they may be combined and assigned to
agents in a variety of ways. The extreme, but not unusual, case is when all
three roles are assigned to one person. This, of course, is the instance
of self evaluation. A number of questions can be asked to determine whether
the evaluation roles have been adequately defined:

Concerning the role of sponsor, who commissioned the evaluation,
and who will pay for it? Why do they want it conducted? What support will
they provide for it? To what extent do they intend to participate in gathering
information? To what extent will the sponsor's work be a subject of the
evaluation? What information do they want? How will they use it? By what
authority have they commissioned the evaluation? These and similar questions
are appropriate for determining whether the role of sponsor has been clarified
to the satisfaction of all parties who must enter into an agreement for the
conduct of an evaluation.

There also are a number of questions to be considered in clarifying
the role of the evaluee. Whose work will be evaluated? What is the nature
of this work? Are they bound to cooperate? Have they agreed to cooperate?
What do they expect to receive from the evaluation? What do they require as
conditions for conducting the evaluation? How will they participate in the
evaluation? What is their relationship to the sponsor and the evaluator? Clearly
it is important that such questions be answered by the main parties to the
evaluation if the evaluee is to play a constructive role in the evaluation.

A third role is that of evaluator. What group will do the evaluation? 
What is their relationship to the sponsor? To the evaluee? What are their 
qualifications to perform the evaluation? Why have they agreed to conduct the 
evaluation? What services do they expect to perform? What persons will they 
assign to perform this work? What support do they require? General responses 
to these questions provide a basic definition of the evaluation role to be 
served. Of course, the evaluator's role is explicated in the detail of the 
technical evaluation design. 

There are then a number of roles that need to be defined and included 
in the written agreements that govern evaluation. By including these definitions 
in the legal instruments that govern evaluation there is a basis for allocating 
specific areas of responsibility and authority in the evaluation work. Placing
agreements about roles in the evaluation contract gives assurances and 
safeguards concerning collaboration among the various groups that must support 
the evaluation. 




2. Evaluation Products 

The second contractual/legal item concerns the products to be 
produced by the evaluation. Just as program personnel should clarify their 
objectives, so should evaluators specify the evaluation outcomes to be produced.

Basically evaluation outcomes refer to the reports to be prepared. 
How many reports are to be produced? What are their content specifications? 
How will they be disseminated? Who will use the reports? How will they use
the reports? How will the quality of the reports be assessed? Generally, it
is desirable that the different parties to an evaluation reach agreements early
concerning the evaluation products to be produced. 

3. The Delivery Schedule

Related to the evaluation products is a third contractual/legal item.
This concerns the delivery schedule for the specified evaluation products.

If the evaluation reports are to be useful they must be timely.
Hence it is important to determine in advance when the evaluation reports will 
be needed and to reach agreements about whether the reports can be produced on 
such a schedule. 

Attempts to reach such agreements often reveal potential timing
problems in the evaluation. To meet the sponsor's time table, the evaluator
often would have to sacrifice the quality of his work. But meeting his own
qualitative specifications would often prevent the evaluator from producing
timely reports. Frequently evaluators and sponsors must compromise concerning
technical and time requirements in order to insure that the evaluation will
achieve a reasonable balance of technical adequacy and timeliness. It is
best that such compromises be effected early in the evaluation work. For
this reason the timing of evaluation reports should be worked out and included
as a specific item in the statement of agreement that will govern the evaluation
work.

4. Editing Evaluation Reports 

The fourth contractual/legal item concerns the editing of evaluation
reports. Basically, this concerns who has authority for final editing of
evaluation reports, but it also concerns the need for checks and balances to insure
that reports contain accurate information. Evaluators, sponsors and evaluees
have legitimate concerns regarding the editing of evaluation reports.

The evaluator needs assurances that he has the ultimate authority in 
determining the contents of the reports that will carry his name. There are 
all too many instances of evaluation reports being edited and released by
sponsors, without first getting the approval of the evaluator. It is not proper
for sponsors to revise evaluation reports so they convey a different (usually 
more positive or negative) meaning than that presented by the evaluator. It 
is proper and often a necessary protection that evaluators require an advance 
written agreement that they will retain final authority regarding the editing 
of their reports. 

But, sponsors and evaluees also deserve certain assurances regarding 
the editing of evaluation reports. All evaluation procedures are subject to 
error. Therefore all evaluation reports potentially contain misinformation.
Moreover the reporting of false results can be unjustly damaging. Hence, it
is reasonable that sponsors and evaluees require that evaluation designs
contain reasonable checks and balances to guarantee the accuracy of evaluation
reports before they are released.

A suggested contractual provision covering editing of evaluation
reports is as follows:

a. The evaluator will have final authority for editing
evaluation reports.

b. The evaluator will provide a preliminary draft of his report 
to designated representatives of the evaluee and sponsor for their review 
and reactions. 

c. These representatives will be given a specified number of days 
within which to file a written reaction to the report. 

d. If received prior to the deadline the evaluator will consider 
the written reactions in the preparation of the final report. 

These points are intended to guarantee final editing authority to 
the evaluator, but to provide the evaluee and sponsor with a means of raising 
questions about the accuracy of preliminary reports. The point is that
evaluations involve potential disputes over editing and accuracy that can be
minimized through the reaching of advance written agreements. 
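
The review procedure in points (b) through (d) can be illustrated with a brief sketch. It is only one possible reading of the provision: the names `Reaction` and `reactions_to_consider` are hypothetical, the number of review days is whatever the contract specifies, and a reaction received on the deadline day itself is assumed to count as timely.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Reaction:
    """A written reaction filed by a sponsor or evaluee representative."""
    author: str
    received: date
    text: str


def reactions_to_consider(draft_sent, review_days, reactions):
    """Return the reactions the evaluator has agreed to consider.

    Point (c): representatives get a specified number of days.
    Point (d): only reactions received by the deadline are considered.
    Point (a) is not modeled: the evaluator still decides what to change.
    """
    deadline = draft_sent + timedelta(days=review_days)
    # Assumption: a reaction arriving on the deadline day is still timely.
    return [r for r in reactions if r.received <= deadline]
```

Under this sketch a late reaction is simply set aside; the evaluator's final editing authority is untouched in either case.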

5. Access to Data 

The fifth contractual/legal item involves the access to data.
Generally evaluators must gather existing data from files and new data from 
system personnel. This situation presents potential problems to evaluees 
and sponsors as well as evaluators. 

The evaluees and sponsors have a special concern for protecting the 
rights of system personnel and for maintaining good relationships with them. 
Certain data in system files are confidential. The system administrators 
need to guard the confidentiality of this information or reach special
agreements about its use in the evaluation. Also, system personnel are not
automatically willing to submit to interviews or to respond to lengthy
questionnaires. Nor, based on their contracts, are they bound to do so. If their
cooperation is to be obtained, it must be requested in advance, and agreements
with the system personnel need to be worked out. Hence the evaluees and sponsors
have an interest in writing advance agreements about access to data.

Of course this is a crucial item as far as the evaluator is
concerned. He can't conduct his evaluation unless he can get the data he needs.
Hence, he also needs to have advance agreements concerning what information
he can expect to get from system files, and concerning what new data he
can obtain. If the evaluator can't get such assurances in advance, his work 
is in jeopardy, and he might just as well cancel the evaluation before it 
starts. 

6. Release of Reports 

The sixth contractual item concerns the release of reports.
Basically this is a matter of who will release the reports and what audiences may
receive them.

A potential problem exists in the possibility that the evaluations
may be released by the sponsor only if they match his predilections and serve
his ulterior motives. This, of course, is a biased use of evaluation and is
to be avoided by professional evaluators. Instead they should insist that
their reports be provided to the prespecified audiences regardless of the nature
of the findings. If there is some doubt about whether the sponsor will release
the report to the prespecified audience, the evaluator, in writing, should
reserve the right to do so himself.

A related problem is in determining what audiences should receive
what reports. In some cases, for example in evaluating the early developmental
work of a new program, it is entirely appropriate that the developers engage an
evaluator to provide them with private feedback for their own use in improving
their work. If the evaluator and developers agree to this condition in advance,
it would be inappropriate for the evaluator to release his report to the
public. In other cases the evaluator and the sponsor might appropriately agree
that a report on the overall merit of a program be developed and released to
the public. In such a case, if the sponsor didn't like the results and decided
not to make them public, the evaluator should release the results. Otherwise
his integrity and the credibility of his work will be justifiably threatened.

It is patently evident that evaluators and their sponsors should
agree in advance regarding what reports should be released to what audiences.
Not all reports should be released to all audiences. But reports should not
be selectively released based on the nature of the findings. Both evaluators
and their sponsors need assurances in this matter. It is therefore urged that
their advance written agreements contain an item pertaining to the release of
reports.

7. Responsibility and Authority 

The seventh contractual item concerns responsibility and authority
for conduct of the evaluation. A prior contractual item concerned the
definition of roles for the evaluator, the evaluee, and the sponsor. This item
concerning responsibility and authority emphasizes that specific work needs to
be performed by each group in the conduct of the evaluation and that specific
agreements about work assignments should be worked out in advance. Including
this item in the contract is intended to insure that the rights of all parties
will be protected and that the evaluation design will be implemented.




Any evaluator knows that he can't do everything that is required to
implement an evaluation. Cooperation is required from many different groups.
Administrators must secure the cooperation of their staffs; and teachers, students,
administrators, community personnel, and others often are asked to provide
information. Often, teachers are engaged to administer tests to their students.
In short the evaluator is dependent on receiving help from many groups in
carrying out the evaluation design.

But evaluators don't have automatic authority to assign
responsibilities to the various groups on whose cooperation the success of the evaluation
depends. They either need to define and work from explicit agreements
concerning who will do what, or they need to depend on their wits and the good will
of the people with whom they intend to work. By far the best practice is to
work out advance written agreements that delineate areas of authority and
responsibility for all parties who will be involved in the evaluation.

8. Finances 

The eighth and final contractual item concerns finances for the
evaluation. Who will pay for the evaluation? How much money has been budgeted
for it? How may this money be spent? What is the schedule of payments? What
are the conditions for payment? How is the schedule of payments correlated
with the delivery schedule for evaluation reports? The matter of finances is,
of course, the most common one in evaluation contracts.

Advance agreements regarding finances should be written to protect
both the sponsor and the evaluator. The sponsor should insure that payments
will be made only if the evaluation objectives are achieved. The evaluator
should be assured that funds will not be cut off midway in the evaluation due
to the nature (as opposed to quality) of the results that are produced. Hence
the evaluator and the sponsor should agree in advance to a schedule of
payments that is dependent only on the evaluator meeting the mutually-agreed-upon
product specifications.

This concludes the discussion of contractual/legal problems.
Basically all parties involved in an evaluation require protection and assurances.
Items to consider in providing the needed protection and assurances are:
(1) role definitions, (2) evaluation outcomes, (3) a delivery schedule of
evaluation reports, (4) editing of reports, (5) access to data, (6) release of
reports, (7) delineation of authority and (8) finances for the evaluation. The
suggestion here is that these items be included in an advance contract (signed
by the appropriate parties) to govern the evaluation. The meta-evaluator's
concern here should be to ascertain whether the evaluation is covered by a set
of written agreements that would adequately forestall potential contractual
and legal problems in the evaluation.
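
As an illustration only, the meta-evaluator's check of contract coverage could be kept as a simple checklist. The eight item names below are taken from the list above; the names `CONTRACT_ITEMS` and `missing_items` are hypothetical, and deciding whether a given agreement really covers an item remains a matter of judgment that no such list can settle.

```python
# The eight item names follow the list in the text; everything else
# in this sketch is a hypothetical illustration.
CONTRACT_ITEMS = (
    "role definitions",
    "evaluation outcomes",
    "delivery schedule",
    "editing of reports",
    "access to data",
    "release of reports",
    "delineation of authority",
    "finances",
)


def missing_items(covered):
    """Return the contractual items with no written agreement, in list order."""
    covered_set = {item.strip().lower() for item in covered}
    return [item for item in CONTRACT_ITEMS if item not in covered_set]
```

An empty result indicates only that each item is mentioned in some written agreement, not that the agreement is adequate.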

So far this discussion of evaluation problems has considered conceptual,
sociopolitical, and contractual/legal problems. But little has been said
about technical problems, which are the ones that have received the most
attention in the formal literature of evaluation. By considering technical problems
fourth in the discussion of six classes of evaluation problems, the intent is
to make the point that technical problems are one important type of problem
the evaluator must face, but by no means the only type.

Technical Problems 

Nevertheless, evaluators must be prepared to cope with a wide range of
difficult technical problems. Nine such problems will be considered in this
section. Attention to these nine items should assist evaluators to convert an
abstract evaluation plan to a detailed technical plan for carrying out the
evaluation work.

1. Objectives and Variables 

The first technical problem concerns the identification of the
variables to be assessed. The problem here is twofold. First, there are
potentially many variables that might be included in any study, and the
evaluator has the difficult task of identifying and choosing among them.
Second, it is usually not possible to choose and operationally define all
the variables before the study starts; hence, the evaluator often must
continually add new variables to his evaluation design. Meta-evaluators
should check evaluation designs for their inclusion of variables that meet
conditions of relevance, scope, and importance; and the meta-evaluators
should check designs for their flexibility and provisions for adding new
variables through the course of the study.

There are a number of ways of dealing with the problem of identifying
evaluative variables. The classic way is to get program personnel to define
their objectives in behavioral terms. This focuses the evaluation on what the
program personnel perceive to be desirable outcomes. Devices that are of
assistance in defining objectives include the Bloom and Krathwohl taxonomies
of cognitive and affective objectives.^^,^^ Also an enormously useful article by
Metfessel and Michael^36^ presents a long list of behavioral indicators for use in
evaluation studies.

This focus on objectives has served well in countless studies, but it
yields variables that are limited in scope. For example, if evaluators focus
exclusively on those variables that relate to the developers' objectives, other
important outcomes and side effects may be missed. Also, variables such as cost,
readability of materials, staff time in a program, and socioeconomic background
of students may be ignored. Hence, there is a need for a broader framework of
variables than that afforded in the concept of behavioral objectives.

A number of broader perspectives have been suggested in the literature.
Clark and Guba^^ have suggested a range of variables that they believe should be
considered in assessing various change process activities. Hammond^^ has
proposed his EPIC cube as a means of choosing variables that reflect student
behavior, institutional involvement, and curricular elements. In a forthcoming
book, Hammond will present an algorithm based on facet analysis wherein evaluators
and program personnel may systematically assign priorities to the potential
variables in the EPIC cube. Stake,^39^ in his Countenance Model, has suggested a
framework that interrelates antecedent conditions, process, and outcomes with
program persons' intents and evaluators' observations. These perspectives
illustrate that the views of what variables should be incorporated in evaluation
have broadened greatly from the early ideas that evaluations should focus
exclusively on outcomes that relate to given objectives.

2. The Investigatory Framework

The second technical problem concerns what investigatory framework
should guide the evaluation. An investigatory framework specifies the
conditions under which data are to be gathered and reported, and the assumptions
to be made in interpreting the findings. In all evaluation studies evaluators
must choose either implicitly or explicitly among a number of alternative
investigatory frameworks, e.g., experimental design, survey, case study, and
site visitation.

No one investigatory framework is superior in all cases. None is
best in serving the criteria of technical adequacy, utility, and efficiency.
Also different frameworks work differentially well under different sets of
feasibility constraints. Thus evaluators may choose different investigatory
frameworks depending on the evaluative purposes to be served, the priorities
assigned to the different criteria for judging evaluation reports, and the
unique conditions under which evaluations are to be conducted. The task is to
choose the framework that will optimize the quality and use of results under
realistic constraints.

Whereas true experimental design is theoretically the best way of
determining cause and effect relationships (through its provisions for internal
and external validity), it is often not feasible to conduct true experiments.
This is because it is frequently impossible to control treatment variables and
the assignment of the experimental units to the treatment conditions. For
example, one would not use experimentation to assess the effects of Sputnik I
on subsequent U.S. educational policy. Neither would one say that it is not
appropriate to make post hoc evaluative interpretations about such linkages.
Also, regarding the matters of relevance, scope, and timeliness, experimental
design often would not assess the right variables or provide timely feedback
for decision making. This is especially true when the concern is to conduct
needs assessments to assist in formulating goals, or to conduct process
evaluations to assist in implementing a project. Experimental design should be
used when it addresses the questions of interest and when it is practicable
to use it; otherwise some alternative framework should be chosen.

The literature presents a number of valuable alternatives to
experimental design. Campbell and Stanley^^ have discussed quasi-experimental
design. O'Keefe^^ has suggested a comprehensive methodology for field-based
case studies. Scriven^42^ has introduced "Goal-Free Evaluation" and more
recently "Modus Operandi Analysis."^^ Reinhard^^ has explained "Advocate
Team Methodology," and Wolf^^ has explicated the "advocacy-adversary" model.
These examples illustrate that evaluators are not bound to use any single
investigatory framework.

The meta-evaluator can perform a valuable service in helping an
evaluator identify and assess alternative investigatory frameworks. To do
this, the purposes (i.e., decision making or accountability) and the foci of
the evaluation study (e.g., goals, design, process and/or results) need to be
known. Also it is necessary to determine any feasibility constraints.
Subsequently, the meta-evaluator can suggest and assess frameworks that are
potentially responsive to the given conditions, and the evaluator can choose
the framework that best fits the given conditions.
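
One way to make such a choice explicit is a weighted rating of the candidate frameworks. This is an assumed technique, not a procedure prescribed in this paper: the weights stand for the priorities assigned to technical adequacy, utility, and efficiency, the ratings are the judgments of the evaluator and meta-evaluator, and feasibility is treated as a constraint that excludes a framework outright. The function name and all numbers are hypothetical.

```python
def choose_framework(ratings, weights, feasible):
    """Pick the feasible framework with the highest weighted rating.

    ratings:  {framework: {criterion: score}}  (judgments, not data)
    weights:  {criterion: priority weight}
    feasible: set of frameworks that meet the feasibility constraints
    """
    best_name, best_score = None, float("-inf")
    for name, scores in ratings.items():
        if name not in feasible:
            continue  # infeasible frameworks are excluded outright
        total = sum(weights[c] * scores[c] for c in weights)
        if total > best_score:
            best_name, best_score = name, total
    return best_name, best_score


# Illustrative judgments only; none of these numbers come from the paper.
ratings = {
    "experimental design": {"technical adequacy": 5, "utility": 2, "efficiency": 2},
    "case study": {"technical adequacy": 3, "utility": 4, "efficiency": 4},
    "survey": {"technical adequacy": 3, "utility": 3, "efficiency": 5},
}
weights = {"technical adequacy": 5, "utility": 3, "efficiency": 2}
choice = choose_framework(ratings, weights, feasible={"case study", "survey"})
```

The value of such a sketch lies less in the arithmetic than in forcing the priorities and feasibility judgments to be stated where a meta-evaluator can examine them.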

3. Instrumentation 

The third technical problem concerns instrumentation. Considering
the purposes of the study, which of the available data gathering instruments
and techniques are most appropriate? Moreover, are any of them adequate?
Must instruments be specially developed to serve the purpose of this study?
Is it feasible to develop new instruments? If it is, what sacrifices will have
to be made regarding the technical adequacy of the instruments? These
questions illustrate common measurement problems that evaluators encounter in
their evaluation work.

The evaluator can, of course, get help from the literature in
identifying potentially useful instruments. The Buros Mental Measurements
Yearbooks^46^ catalog and assess many instruments, especially in the cognitive
domain. A recent book by Miles, Lake, and Earle^47^
identifies and discusses a number of instruments in the affective domain.
Glass has compiled a set of fugitive instruments that have been developed
and used in federal projects, and one can identify many different instruments by
checking through completed doctoral dissertations. So the evaluator can be
greatly assisted in choosing instruments by surveying the relevant literature.

Even then, however, he may not find appropriate instruments. In
this case he is often faced with a dilemma. Should he choose an inappropriate
instrument that has been validated? Should he develop and use new instruments
that respond directly to the purposes of the study, when there is no possibility
of validating the instruments before they are used? The position in this paper
is that the latter course of action often is the only feasible one. In any
case, problems of instrumentation are key concerns in assessing evaluation
designs.

4. Sampling 

The fourth technical problem in evaluation studies concerns sampling.
What is the population? Is an inference to be made to this population? How
large a sample is needed? Can a random sample be drawn? Should the sample
be stratified according to certain classification variables? Can the
experimental units be randomly assigned to program conditions? How much testing time
can be expected from each sampled element? Is examinee sampling necessary or
would matrix sampling be better? Is matrix sampling feasible? If random selection
and assignment of experimental units are not feasible, what can be done to guard
against bias in the sample?

These questions denote a number of sampling-related difficulties that
often are encountered in evaluation work. Even under the best of circumstances
inference to a population based on the performance of a random sample is
logically not possible. As Campbell and Stanley^48^ have pointed out, "generalization
always turns out to involve extrapolations into a realm not represented in one's
sample." Evaluators, however, are rarely able even to draw a random sample,
so their problems of extrapolation are even worse. The least they can do is to
consider and respond as best they can to questions such as those posed
above.
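
Where random selection and assignment are feasible, two of the questions posed above (stratification and random assignment) can be illustrated with a minimal sketch. It assumes that a complete list of units exists and that each unit's stratum is known; the function names, the sampling fraction, and the seed are hypothetical, and matrix sampling is not shown.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, fraction, seed=0):
    """Draw the same fraction at random from each stratum (at least one unit)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for key in sorted(strata):
        members = strata[key]
        k = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, k))
    return sample


def random_assignment(units, conditions, seed=0):
    """Shuffle the units, then deal them out to the conditions in rotation."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    return {unit: conditions[i % len(conditions)]
            for i, unit in enumerate(shuffled)}
```

Fixing the seed makes the draw reproducible, which allows a meta-evaluator to verify later how the sample and the assignment were produced.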

5. Data Gathering

It is one thing to choose instruments and samples, but it is quite
another actually to gather the data. Often the evaluator must rely on a
number of persons in addition to himself for the gathering of data. For
example, teachers must often be relied on to administer tests to students.
This fifth technical problem of data gathering presents a number of difficulties
to which the evaluator must be sensitive and responsive.

A number of questions point up the difficulties in data gathering.
Who will deliver instruments to the data gathering sites? What is to prevent
teachers from teaching to the tests? How can the cooperation of test administrators
and respondents be secured? What can be done to insure motivation of the
respondents and prevent cheating? In what settings will the respondents work?
Who will administer the instruments? Who will monitor the data gathering
sessions? How will standardization of data gathering conditions be assured?
Unless evaluators consider and respond to such questions, their evaluations may
fail due to poor implementation of the data gathering plan.

6. Data Storage and Retrieval 

The sixth technical problem concerns the storage and retrieval of
data. Once the data have been gathered it is necessary to check them for
accuracy, to code them properly, and to store them for future use.

Meta-evaluators should check whether provisions have been made to
accomplish these tasks. While the tasks are fairly routine, failure to deal
effectively with them can destroy the effectiveness and efficiency of the
evaluative effort.
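
The three routine tasks named above (checking, coding, and storing) might be sketched as follows. The codebook, the field names, and the choice of a line-per-record JSON file are all assumptions made for illustration; an actual study would define its own coding scheme and storage medium.

```python
import json

# Hypothetical codebook for one questionnaire item.
CODEBOOK = {"agree": 1, "neutral": 2, "disagree": 3}


def check_and_code(records):
    """Split raw records into coded rows and rejected rows.

    A record is rejected if it lacks a respondent id or carries a
    response that is not in the codebook.
    """
    coded, rejected = [], []
    for rec in records:
        response = str(rec.get("response", "")).strip().lower()
        if not rec.get("id") or response not in CODEBOOK:
            rejected.append(rec)
            continue
        coded.append({"id": rec["id"], "response": CODEBOOK[response]})
    return coded, rejected


def store(coded, path):
    """Write one JSON object per line so rows can be retrieved later."""
    with open(path, "w", encoding="utf-8") as fh:
        for row in coded:
            fh.write(json.dumps(row) + "\n")


def retrieve(path):
    """Read back the rows written by store()."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Keeping the rejected records rather than discarding them gives the meta-evaluator something to inspect when judging the accuracy of the stored data.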




7. Data Analysis

The seventh technical problem concerns the analysis of data. Both
statistical and content analysis are involved. The meta-evaluator should
ascertain what plans have been made to analyze the data that will be obtained.
He should check the plans for their appropriateness in responding to the study
questions. He also should check whether assumptions required for the data
analysis will be met by the data. Lastly, he should assess the provisions
that have been made for performing the actual data analysis.

Many texts are available to assist in the analysis of data. Those
prepared by Glass and Stanley,^^ Winer, Guilford, and Siegel^^ are viewed
in this paper as especially useful.

8. Reporting

The eighth problem concerns the preparation of evaluation reports. 
What different reports will be required for the different audiences? How 
will they be organized? What tables will they include? How long should 
they be? How will they be presented and interpreted to the audience? 

This problem area is a reminder that evaluations must be informative.
Doing an outstanding job of data collection and analysis will fall short of
meeting the purposes of an evaluation if the results are not communicated
effectively to the designated audiences. Therefore, meta-evaluators should
ascertain whether appropriate reports will be prepared and whether appropriate
communication techniques will be used to interpret the findings to the
prespecified audiences.

9. Summarizing the Technical Adequacy of the Design

The ninth and final problem involves summarizing the technical
adequacy of the evaluation plan. Have the evaluation variables been
identified and are they the right ones? Has a relevant and feasible investigatory
framework been chosen? Has this framework been fleshed out in the form of
appropriate instruments, sampling techniques, and analysis procedures? Have
sufficient provisions been made for collecting, storing, retrieving, and
reporting the information? Overall, will the evaluation yield results
that are reliable, valid, objective, and useable?

If the evaluator can summarize his evaluation design through
answering affirmatively to the above questions, he can be sure that his
technical plan is sound. If he cannot, he should review and revise his
technical plan. While technical problems are not the only problems that
evaluators must address, they certainly are crucial ones.

Management Problems 

So far it has been noted that evaluation problems are conceptual,
sociopolitical, contractual/legal, and technical in nature. Next, a fifth area
will be considered. In this area it is emphasized that evaluation studies
must be properly managed, and that evaluators must cope with a number of crucial
management problems. Specifically, ten management problems will be introduced
and discussed. It is to be noted that evaluators should not only deal with
these problems, they should do so in such a way as to enhance the ability of
the parent agency to improve its long-range capabilities to manage evaluation
studies.

1. The Organizational Mechanism 

The first management problem concerns the organizational mechanism
for the evaluation. This is a matter of determining what organizational unit
will be responsible for the evaluation.

Alternative possibilities exist. An in-house office of evaluation
might be assigned to do the evaluation. An external evaluation group might
be commissioned. A consortium of agencies might set up an evaluation center
that they jointly support and this center might be assigned to do the work.
The program staff, themselves, might perform a self evaluation; or they
might do it themselves but engage an external auditor periodically to
assess their work.

Each of these approaches has been applied in evaluating educational 
programs. These organizational alternatives have differing costs and benefits. 
The meta-evaluator should identify what alternative has been chosen and 
compare its costs and benefits with those of alternative organizational 
arrangements. 

2. Organizational Location of the Evaluation

The second management problem concerns where the evaluation is
located within the organization. Will the evaluators report directly to the
executive officer of the agency in which the program is housed? Will the
evaluator also be able to report directly to staff members at lower levels
of the system? Will he be enabled to communicate directly with members of the
agency's policy groups? In general, through what channels may the evaluator
influence policy formulation and administrative decision making?

This is a crucial issue that affects particularly the pervasiveness,
credibility, and timeliness of evaluation work. If reports are submitted only
through the chief executive officer, other members of the system may doubt
the credibility of the reports. On the other hand, if reports are sent
directly to persons at all levels of the system, the chief decision makers may
feel greatly threatened by the evaluation, especially if the evaluator interacts
directly with members of the agency's policy board. Moreover, if reports must
pass through the chief executive's office, the reports may fail to meet criteria
of pervasiveness and timeliness. An illustration of this is when individual
student diagnostic records are sent by a testing company to the central
administration of a school district and only weeks later reach the teacher
who could make constructive use of the results. Clearly the matters of
organizational location and reporting channels are crucial concerns
in any evaluation study.

3. Policies and Procedures

A third management concern is that of policies and procedures which
govern evaluation activities. The evaluator needs to find out about existing
policies and procedures that will affect or govern his work. Also he should
be alert to opportunities that he might use to help the agency that
commissioned the evaluation to develop and adopt policies and procedures to
govern its future evaluation work.

Such policies and procedures might include a number of items.
Delineation of evaluation roles and assignment of responsibilities for those
roles are fundamental concerns. A conceptual scheme to guide the agency's
evaluation work might also be provided, as was done in Michigan through
legislating a six-step accountability model. Of course such a statement of
policies and procedures should specify how the evaluation work is to be
financed. Examples of formal manuals of evaluation policies and procedures
are those adopted by the Saginaw, Michigan Public Schools⁵³ and the Ohio State
University College of Education.⁵⁴

4. Staffing Problems 

The fourth administrative problem concerns the staffing of the
evaluation work. Who will have overall responsibility for the work? Who
will be assigned the operational responsibility? What other roles are to be
manned? Who will be assigned to these roles? What recruitment of personnel
must be done? Who will be considered? What criteria will be used to assess
their qualifications? Who will choose them? Quite obviously evaluations are
often team efforts and it is crucial to choose qualified personnel to perform
the evaluations.

Beyond meeting the immediate evaluation requirements, the staffing
of an evaluation sometimes provides significant opportunities for upgrading
the long-range evaluation capability of the agency whose program is being
assessed. Evaluation projects are excellent settings within which to train
evaluators. If persons are recruited partially because they want to become
evaluators in the agency whose program is being evaluated, they can be trained
through their immediate evaluation assignment and subsequently be kept on by
the agency as evaluators. Illustrations of this are that Dr. Jerry Walker
(who heads evaluation in the Ohio State University Center for Vocational
and Technical Education), Dr. Howard Merriman (a prominent evaluator in the
Columbus, Ohio Public Schools), and Mr. Jerry Baker (Director of Evaluation
in the Saginaw, Michigan Public Schools) were recruited, trained, and later
employed on a continuing basis exactly in this way.

Staffing is obviously a key problem in the management of evaluation
work. The quality of the evaluation will largely depend on the competence
and motivation of the staff. At the same time there is often an opportunity
to upgrade an agency's long-term evaluation capability through judicious
recruitment and training of persons who may want to stay on in the agency in
the capacity of evaluator after the initial evaluation assignment has been
completed. The meta-evaluator should carefully assess the evaluator's
provisions for meeting his staffing needs and serving opportunities for
longer-range staffing payoff.



5. Facilities 

The fifth management problem in evaluation concerns the facilities
needed to support the evaluation. What office space, equipment, and materials
will be needed to support the evaluation? What will be available? Answers
to these questions can affect the ease with which evaluations are carried out
and even their success. Thus evaluators should be sure that their management
plans are complete in their provisions of the necessary facilities.

6. Data Gathering Schedule 

The sixth management problem to be identified involves the scheduling
of data collection activities. What samples of persons are to respond to what 
instruments? When are they to respond? Is this schedule reasonable, and is 
it acceptable to the respondents? When will the instruments and administration 
arrangements need to be finalized? Will the instruments be ready when they are 
needed? Will students still be in school when the administrations are to occur? 
Are there any potentially disastrous conflicts between the data gathering schedule 
and other events in the program to be evaluated? Overall, is the data gathering 
schedule complete and feasible? 

The above questions illustrate difficulties that do plague evaluation
studies. In one case a government-sponsored $250,000 evaluation study of
programs for disadvantaged students actually was scheduled so that student data
had to be gathered in July and August. The evaluators, who were from outside
the field of education, had forgotten that most students do not attend school
in the summer. In another situation an evaluator planned to administer ten
different instruments to the same group of principals during a three-week period.
While it is important in many studies to ascertain the school principals'
perceptions, bombarding them with questionnaires will elicit neither good will
nor cooperation. As a final example, an evaluator scheduled observations of
teachers during a week when they were administering state tests. This would have
been fine if the purpose of the study had been to determine teacher competence
in test administration, but it certainly was a poor time to assess their use
of some new curriculum. These examples argue that meta-evaluators should
pay attention to the appropriateness of the data gathering schedule.
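
A check of this kind can be made mechanical. The sketch below is only illustrative: it compares a planned data gathering schedule with periods when data cannot sensibly be gathered. All dates, activity names, the `BLOCKED_PERIODS` list, and the function `schedule_conflicts` are hypothetical and are not taken from any of the studies mentioned above.

```python
from datetime import date

# Hypothetical periods during which the planned data cannot sensibly be gathered.
BLOCKED_PERIODS = [
    ("summer vacation", date(1974, 7, 1), date(1974, 8, 31)),
    ("state testing week", date(1974, 3, 11), date(1974, 3, 15)),
]


def schedule_conflicts(sessions, blocked=BLOCKED_PERIODS):
    """Return (activity, blocked-period) pairs whose date ranges overlap."""
    conflicts = []
    for activity, start, end in sessions:
        for label, b_start, b_end in blocked:
            # Two closed date ranges overlap when each starts before the other ends.
            if start <= b_end and b_start <= end:
                conflicts.append((activity, label))
    return conflicts


# Hypothetical data gathering plan: (activity, first day, last day).
plan = [
    ("student achievement testing", date(1974, 7, 8), date(1974, 7, 19)),
    ("teacher observations", date(1974, 3, 12), date(1974, 3, 14)),
    ("principal questionnaire", date(1974, 4, 1), date(1974, 4, 5)),
]

for activity, reason in schedule_conflicts(plan):
    print(f"{activity}: conflicts with {reason}")
```

A fuller version might also count how many instruments fall on the same respondents within a short period, which is the difficulty in the example of the principals.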

7. Reporting Schedule 

The seventh management problem also concerns scheduling, but in this 
case the scheduling of reports. What reports will be provided, to what audiences, 
according to what schedule? Meta-evaluators should check reporting schedules
for their completeness in these respects and for their potential for 
communicating effectively to the prespecified audiences. Also it is important 
that such schedules be checked for their feasibility. The scheduling of reports 
bears directly on how useful the reports will be to the designated audiences. 

8. Training 

The eighth management problem in evaluation concerns training. As 
mentioned previously in this paper, evaluation is largely a team activity, 
and the evaluation team must often depend on the cooperation of system personnel 
in conducting the evaluation. If the various persons are to perform their 
roles effectively, they often need special evaluation training. Hence, 
evaluators should be prepared to meet such training requirements. 

In most situations the training should be both general and specific. 
The specific training is needed for the performance of specific evaluation 
tasks, e.g. the administration of a particular test or interview, or the coding 
of a particular set of data. However, it is also desirable to give training 
in the general principles of evaluation. Such training assists persons to 
understand their particular roles; it provides them with general guidelines for 
making specific decisions in the course of implementing their role; and it
improves their overall ability to perform future evaluations. Thus, training
activities within evaluation studies should prepare persons to perform their
particular assignments, but they should also present them with opportunities for
upgrading their general understanding of evaluation.

A variety of approaches to training within evaluation studies can be
applied. Blaine Worthen, Director of Evaluation in the Northwest Regional
Educational Laboratory, runs periodic sack lunch seminars that focus on topics
selected by his staff. The Columbus Schools Department of Evaluation at one
time supported two full-time persons whose primary assignment was to continually
provide consultation to existing evaluation staff and inservice training in
evaluation for administrators, teachers, and new evaluation staff members.
Several agencies have engaged external review panels to study their evaluation
operations and provide training based on the analyses. The Western Michigan
University Evaluation Center frequently invites evaluators to present their
work to the Center staff, whereupon the Center's staff members critique
the work. (This is especially good because both parties gain from the exchange
of information and discussion and neither charges the other.) Also, NIE,
USOE, and AERA have sponsored the development of a large number of evaluation
training packages. These cases illustrate that different means can be
found to conduct needed training within evaluation studies.

The content for such training can be highly variable. Considerations
in determining what training should be provided include who will be trained,
what their assignments are, what they want and need to know, how they will
use evaluation in the future, and what opportunities exist for providing the
training. A good source of information about the content for evaluation
training programs is a doctoral dissertation by Darrell K. Root⁵⁵ on the
topic of the differential evaluation training needs of administrators and
evaluators.



Overall, training is a key area in evaluation work. It is potentially
very cost/effective since it enhances the ability of persons to implement
their specific evaluation assignments; and it uses training opportunities to
prepare these same persons for future evaluation work.

9. Installation of Evaluation 

The ninth management concern in evaluation is more an opportunity 
than a problem. This concerns the matter of using specific evaluations as 
a means of installing systematic evaluation in a system. The position in
this paper is that evaluators should be alert to such opportunities and 
capitalize on them whenever possible. In this way evaluators can aid the 
systems that house the programs being evaluated to increase their capacities 
to evaluate their own activities. 

This is a crucial need in education. There never will be sufficient
evaluation companies to perform all the needed evaluation. In any case much
of the needed evaluation should not be done by external agents since they are
sometimes too threatening and too expensive. But, as Adams⁵⁶ discovered when
he surveyed all the school districts in Michigan, few educational agencies
have their own evaluation capabilities. Thus, there is a need to aid educational
agencies to develop their own systems of evaluation.

A standard practice of the Ohio State University Evaluation Center was
to use evaluation service contracts as a means of assisting agencies to develop
their own evaluation systems. Notable examples are evaluation projects
performed for the Columbus, Ohio and Saginaw, Michigan Public Schools. In both
cases the school districts had no evaluation capability, had encountered
requirements to evaluate their federally-supported projects, and engaged the
Ohio State University Evaluation Center to conduct the evaluations. That
Center contracted both to conduct the needed evaluations and to develop
evaluation departments for the school districts.

Both purposes were served through a common approach. The evaluation
effort was staffed with teachers from the two school districts who declared
interests in becoming system evaluators and who gave promise of becoming
good evaluators. These teachers were enrolled in graduate programs in
evaluation and were provided field-based training in evaluation. Of course,
that training revolved around the work assignments in the evaluation projects.
At the end of the evaluation projects the Columbus and Saginaw personnel,
now with graduate training and degrees in evaluation, returned to their
school systems to man new departments of evaluation.

The continued operation and the achievements of both departments
attest to the power of this approach. The Saginaw, Michigan Department of
Evaluation has been rated by the Michigan Department of Education as a model
evaluation system.⁵⁷ The School Profile developed by the Columbus Schools
Department of Evaluation has been adopted nationally by a number of school
districts. Of course, the achievements of Dr. Howard Merriman (recent Vice
President of the American Educational Research Association's evaluation
division), who was one of the Columbus teachers chosen to work on the Ohio
State contract with Columbus, dramatically illustrate that school districts
may have potentially outstanding evaluators in their own teaching and
administrative ranks.

The position in this paper is that special evaluation projects 
should be viewed as potential opportunities for upgrading an agency's 
evaluation capability. Meta-evaluators should ascertain whether evaluation 
staffs have sought out and responded to such opportunities. 

10. Budget for the Evaluation 

The tenth and final management item involves the budget. Is there
one? Does it reflect the evaluation design? Is it adequate? Does it have
sufficient flexibility? Will it be monitored appropriately? While these are
obvious questions, it is surprising how often grandiose evaluation plans are
not accompanied by supporting budgets. It has become the habit of the author,
when evaluating evaluation plans, to first review the budget for evaluation.
If none exists, it matters little how good the technical plan is, for it will
not be possible to implement it. If a budget does exist, it clearly needs to
be checked for its sufficiency.

This concludes my discussion of management problems in evaluation. Hopefully 
the ten management items that were discussed will prove useful to evaluators as 
they review their plans for managing evaluation activities. The position in 
this paper has been that evaluation efforts should be managed both to achieve 
specific evaluation objectives as efficiently as possible, and to help the 
agencies involved in the evaluation to upgrade their internal evaluation
capabilities. 

Moral, Ethical and Utility Considerations 

The final class of evaluation problems involves moral, ethical and
utility questions. Evaluations are not merely technical activities; they
are performed to serve some socially valuable purpose. Determining the
purpose to be served inevitably raises questions about what values should
be reflected in the evaluation. Deciding on value bases also poses
ethical conflicts for the evaluator. Also, as emphasized before in this
paper, the evaluator must be concerned with what practical uses his evaluative
reports will serve. This final set of problems identifies and discusses
six issues that the evaluator must face in regard to moral, ethical, and utility
matters.

1. Philosophical Stance 

The first issue concerns what philosophical stance will be assumed 
by the evaluator. Will the evaluation be value-free, value-based, or value- 
plural? Each of these positions has its advocates. 

Some say that evaluators should merely provide data, without
regard for the values of any particular group, such as consumers or producers.
Persons who take this position are committed to a value-free social science.
Their rationale is that evaluators should be objective and should not adopt
any particular value position as a basis for their work. A consequence of this
position is that evaluators provide data, but not recommendations. A difficulty
of this approach is in determining what data to collect since there is no
particular value framework from which to deduce criteria. Selection of values
for interpreting the findings is left to the audiences for the reports.
Overall, the value-free option emphasizes the objectivity and neutrality of
the evaluation but provides no guidance for choosing variables or interpreting
results.

A second option is a value-based position. Here the evaluator
chooses some value position and through his work attempts to maximize the good
that can be done as defined by this position. The value-based evaluator may
decide that his evaluation should optimize the Protestant Ethic, equal
opportunity for persons of all races, Marxism, or principles of Democracy,
to name a few possibilities. Once he has chosen a value base, the appropriate
variables that might be assessed and the rules for interpreting observations
on those variables are theoretically determined. The value-based evaluator
is neither neutral nor objective concerning what purposes his evaluation should
serve. His evaluation can be viewed (and critiqued) in terms of its social
mission.

A third philosophical stance might be termed a value-plural position.
According to this position evaluators remain neutral concerning the selection
of a particular value position, but they explicitly search for and use
conflicting value positions in their collection and interpretation of data.
Thus they can show the consequences of a particular action in relation to the
different value positions that might be served by the action.

An example of this third philosophical stance occurred when a team 
of evaluators was commissioned to identify and assess alternative ways of 
educating migrant children. The evaluators identified value positions 
advocated by experts in migrant education and by the migrants themselves. 
The experts said the chosen alternative should be the one that gave best 
promise of developing reading and arithmetic skills. However, the migrants 
urged that the chosen alternative should be the one that would best help their 
children to be socialized into society. These positions represented, for 
the evaluators, conflicting value positions that might be used to search for 
and assess alternative instructional strategies. 

Using either position by itself would produce a biased set of strategies,
but using both would increase the range of strategies. Using criteria from
both philosophical positions would produce different evaluations of each
identified strategy.

As an example, two alternatives (among others) were identified. One
was to operate a resident school in the desert, the other to totally integrate
the migrant children into regular classrooms. The former strategy rated
high in meeting criteria of improved reading and arithmetic performance, but
was a disaster in relation to the socialization objective. The opposite
was true for the approach involving total integration. Based on their
respect for both conflicting value positions, the evaluators identified
additional alternative strategies that represented a compromise position.
This example illustrates that an evaluator's philosophical stance can have
drastic effects on the results that will be produced through his evaluations.
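
The way the verdict on a strategy shifts with the value position applied can be sketched as a small table of ratings. The qualitative ratings below follow the example in the text ("high" versus a "disaster", coded here as "low"); the dictionary layout and the helper name `acceptable_under_all` are assumptions made only for illustration.

```python
# Ratings follow the migrant education example; "low" stands in for "a disaster".
ratings = {
    "resident school in the desert": {
        "reading/arithmetic": "high",
        "socialization": "low",
    },
    "total integration into regular classrooms": {
        "reading/arithmetic": "low",
        "socialization": "high",
    },
}


def acceptable_under_all(ratings, value_positions):
    """List the strategies rated 'high' under every value position considered."""
    return [
        strategy
        for strategy, by_position in ratings.items()
        if all(by_position[position] == "high" for position in value_positions)
    ]


# One value position at a time, then both together.
print(acceptable_under_all(ratings, ["reading/arithmetic"]))
print(acceptable_under_all(ratings, ["socialization"]))
print(acceptable_under_all(ratings, ["reading/arithmetic", "socialization"]))
```

Under either value position alone one strategy looks acceptable, while under both together neither does, which corresponds to the evaluators' decision to search for compromise strategies.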

2. Evaluator's Values

The second problem concerns the evaluator's values. Will his
values and his technical standards conflict with the client system's values?
Will the evaluator face any conflict-of-interest problems? What will be done
about possible conflicts? Evaluators are often faced with questions like
these and should deal openly and directly with them.

An example of a conflict between an evaluator's technical standards
and the client system's values occurred in an evaluation of a "free school."
The evaluator believed that it was essential to administer achievement tests 
to the school's students. The "free school" administrators said that the 
"free school" philosophy does not permit the testing of students. While this 
was an extreme case it illustrates problems that evaluators may encounter in 
performing what they consider to be necessary evaluation tasks. 

The evaluator can also encounter conflicts of interest. Being on
the payroll of the agency whose work is being evaluated insures that potential
conflicts of interest will emerge. The evaluator, being committed to the success
of the agency, or at least to preserving his job, may find it difficult to report
negative results. This is also the case when the evaluator has, at some
previous time, served as a consultant to the agency. It is good ethical
practice for evaluators to identify and report their potential conflicts
of interest, to guard against their influence on their work, and, if necessary,
to withdraw from the evaluation assignment.

3. Judgments

Another issue the evaluator must face is whether his reports
should present judgments (or merely descriptions) of what has been observed.
Will the evaluator report no judgments? Will he report his own; or will he
obtain, analyze, and report the judgments of various reference groups?
The evaluator's responses to these questions will pretty well determine
his role in decision making in the activity being observed.

If the evaluator decides to present no judgments, he will leave
decision making completely up to his client. If the evaluator presents
his own judgments, he likely will have a strong influence on decision making.
If he presents judgments of various reference groups, he will not have decision
making power himself, but will help the chosen reference groups to exercise
such power. The point here is that the evaluator has options concerning how
he should treat the matter of judgment in his evaluation and he should weigh
the consequences of each option against his particular philosophical stance on
evaluation.

4. Objectivity 

The fourth problem is that of objectivity. As an evaluator collects
and reports data during the course of a program, how can he keep his independence?
If the program personnel adopt his recommendations, how can the evaluator any
longer be neutral about the merit of related actions? Likewise, how can the
evaluator avoid being co-opted by program personnel who win his confidence and
support his ego needs?

Tom Hastings once told me that "objectivity is a matter of 
intelligence and integrity." I interpret this to mean that evaluators should
know whether they have lost their independent perspective and that if they 
have they should ask that they be replaced in the evaluation job. 

5. Prospects for Utility 

A fifth concern in this section is whether the evaluation is merely
an academic exercise or has real prospects for utility. The criteria of
relevance, scope, importance, credibility, timeliness, and pervasiveness have
been mentioned before. It is reiterated here that the evaluator should
seriously assess and report on the prospects that his evaluation plan has
for being useful.

6. Cost/Effectiveness 

Finally the evaluator should assess the cost/effectiveness of his 
plan. Compared to its potential payoff, will the evaluation be carried out at a 
reasonable cost? Is the potential payoff worth what it will cost? 

This completes the discussion of evaluation problems. While technical
matters are a key problem area for the evaluator, he must solve many other
kinds of problems. These include conceptual, sociopolitical, contractual/legal,
management, and moral/ethical problems. All such problems
must be anticipated and avoided if evaluations are to be technically sound,
useful, and efficient. Consequently, evaluators need a technology by which
continually to assess their evaluative plans and activities. We consider what
form such a technology might have in the next part of this paper.



II. A Conceptualization of Meta-Evaluation

Part I of this paper introduced the concept of meta-evaluation through
presenting certain background factors and problems. The need for
meta-evaluation was described. An emergent literature of meta-evaluation
was identified. Eleven criteria for judging evaluation results were
presented. Also a list of problems commonly encountered in evaluation work
was presented as a format for guiding ongoing evaluation work and for
diagnosing the weaknesses of completed evaluation efforts. Overall, Part I
provides a foundation for a conceptualization of meta-evaluation.

This second part of the paper contains a conceptual response to the first
part. Included are a definition of meta-evaluation, premises for a
conceptualization of meta-evaluation, and a logical structure for designing
meta-evaluation activities. Taken together, these are suggested as a
conceptualization of meta-evaluation.

Meta-Evaluation Defined

In the Introduction, meta-evaluation was defined as the evaluation of
evaluation. More specifically it is defined in this paper as a procedure
for describing an evaluation activity and judging it against a set of ideas
concerning what constitutes good evaluation.

This, of course, means that meta-evaluation is higher-order
evaluation, and that it includes evaluations that are secondary, tertiary,
etc. This presents a practical dilemma, since meta-evaluation involves
infinite regression,⁵⁸ and since it is not practical to act on the infinite
possibilities of evaluating evaluations of evaluations of evaluations . . .
It is emphasized that infinite regression is a fundamental part of the
conceptualization of meta-evaluation. This paper, however, is restricted
mainly to dealing with second-order evaluations; these are meta-evaluations
that are once-removed from the primary evaluations. It is assumed that
second-order meta-evaluations are feasible, important, and sufficient in
most practical situations.

Premises 

Since meta-evaluation is a form of evaluation, the conceptualization of
meta-evaluation must be consistent with some conceptualization of evaluation.
The conceptualization used in this paper has seven premises. Essentially
these are the author's responses to the seven questions in conceptualizing
evaluation that were discussed in the first part of this paper. These
premises are listed and related to the concept of meta-evaluation below.

1. Evaluation is the assessment of merit; thus, meta-evaluation
means assessing the merit of evaluation efforts.

2. Evaluation serves decision making and accountability; thus
meta-evaluation should provide information pro-actively to support the decisions
that must be made in conducting evaluation work, and meta-evaluation should
provide retroactive information to help evaluators be accountable for their
past evaluation work. Another way of saying this is that meta-evaluation
should be both formative and summative.

3. Evaluations should assess goals, designs, implementation, and results.
Thus meta-evaluation should assess the importance of evaluation objectives,
the appropriateness of evaluation designs, the adequacy of implementation of
the designs, and the quality and importance of evaluation results.

4. Evaluation should serve all persons who are involved in and affected
by the program being evaluated; hence, meta-evaluation should serve evaluators
and all the persons who are interested in their work.

5. Evaluation should be conducted by both insiders and outsiders; 
generally (but not always) insiders should conduct formative evaluation 
for decision making, and outsiders should conduct summative evaluation for 
accountability. Hence, evaluators should conduct formative meta-evaluation
and they should obtain external judgments of the overall merit of their
completed evaluation activities. 

6. Evaluation involves the process of delineating the questions to be
addressed, obtaining the needed information, and using the information in
decision making and accountability. Hence, meta-evaluators must implement
three steps. The evaluators must delineate the specific meta-evaluation
questions to be addressed. They must next collect, organize, and analyze
the needed information. Ultimately they must apply the obtained information
to the appropriate decision making and accountability tasks.

7. Evaluation must be technically adequate, useful, and cost/effective. 
Meta-evaluation must satisfy the same criteria. 

A Logical Structure for Meta-Evaluation 

These seven premises have been used to generate the meta-evaluation
structure that appears in Figure 4. This structure portrays meta-evaluation
as a methodology for assessing the merit of proposed and completed evaluation
efforts (the first premise). The framework has three dimensions; they relate
to the purposes, objects, and steps (the second, third, and sixth premises)
of meta-evaluation studies. The contents of the cells of the structure reflect
that evaluation work should meet the criteria of technical adequacy, utility,
and cost/effectiveness (the seventh premise). The structure reaffirms that
insiders should conduct proactive meta-evaluation and that external agents should
conduct retroactive meta-evaluation work (the fifth premise). It is an implicit
assumption of the structure that meta-evaluation findings should serve the
evaluators whose work is being judged and all persons who are interested in
their work (the fourth premise). Overall, this structure is presented as a guide
for designing meta-evaluation activities.

[Figure 4. A logical structure for meta-evaluation]

Given this overview of the structure, each of its dimensions will next be
considered. Then the interaction of the three dimensions will be discussed.

Purposes of Meta-Evaluation

The first dimension of the matrix indicates that meta-evaluation should 
serve two purposes. They are decision making and accountability. 

Supporting decision making in evaluation efforts requires that meta-evaluation
be done pro-actively to provide timely recommendations concerning how evaluation
studies should be designed and conducted. Meta-evaluation that serves decision
making may be termed formative meta-evaluation. As noted in Figure 4,
formative meta-evaluation usually is conducted by insiders, i.e., those who
do the evaluation that is being guided by the meta-evaluation. Conducting
formative meta-evaluation is proposed as a direct way of insuring that
evaluations will produce results that are technically adequate, useful, and
cost/effective.

The second purpose of meta-evaluation is to serve the evaluator's need
to be accountable for his work. This purpose requires that meta-evaluation be
conducted retroactively to produce public judgments of the merits of the
completed evaluation work. Meta-evaluation that serves accountability is
synonymous with summative meta-evaluation. A careful examination of the
framework reveals that much of the information required in summative
meta-evaluation is potentially available from formative meta-evaluation. Thus,
formative meta-evaluation potentially can provide a preliminary data base
for summative meta-evaluation. However, to insure the credibility of the
results, meta-evaluation for accountability should usually be conducted by
outsiders.

Steps in the Meta-Evaluation Process

The second dimension of Figure 4 indicates there are three basic steps 
in conducting meta-evaluation studies, whether in the decision-making or 
accountability modes. These steps are delineating the information 
requirements, obtaining the needed information, and applying the obtained
information to achieve decision-making and accountability purposes. Thus
methods for meta-evaluation should assist in determining questions, in 
gathering and analyzing the needed information, and in using the information 
to answer the meta-evaluation questions. 

Objects of Meta-Evaluation

The third dimension of the structure denotes four objects of meta-evaluation. 
They are evaluation goals, evaluation designs, evaluation processes, and
evaluation results.

Evaluation goals pertain to the ends to be achieved by the evaluation. What 
audiences are to be served? What are their questions? What information do 
they want? What information will be provided to them? How is the evaluative 
feedback supposed to influence the actions of the audience? These questions 
illustrate considerations in the formulation and assessment of alternative 
evaluation goals. 

Basically an evaluation goal is an intent to answer certain questions,
to enlighten some audience, and to influence their actions in the direction
of rationality. There are obviously alternative possible goals for any
evaluative effort; hence it is important to identify and assess the
competing evaluation goals.

The second object of meta-evaluation concerns evaluation designs.
Obviously, there are alternatives. The choice of the appropriate design
depends on what evaluation goals have been chosen and a variety of practical 
and sociopolitical considerations. Hence, it is important in evaluation work 
to identify and assess alternative evaluation designs. 

The third object of meta-evaluation involves evaluation processes. It 
is one thing to choose a potentially strong evaluation design. It is quite 
another to carry it out. As discussed in Part I of this paper, a variety of 
practical problems can invalidate the strongest of theoretical evaluation 
designs. Hence it is important to identify potential implementation problems 
in relation to chosen evaluation designs and to assess their impact on the 
evaluation results. 

The fourth object of meta-evaluation concerns evaluation results. Were 
the study questions answered? How well? Were the findings communicated to 
the designated audiences? Did they understand the findings? Did they apply 
them? Were their applications defensible given the evaluation results? These 
questions illustrate the considerations in evaluating evaluation results.

Interactions of the Three Dimensions 

Given these descriptions of the three dimensions of Figure 4, it is
appropriate to consider their interactions. Basically, Figure 4 identifies 
and characterizes two major classes of meta-evaluation designs. These are the 
Proactive or Formative Meta-Evaluation Designs, and the Retroactive or Summative
Meta-Evaluation Designs. Each of these classes of designs is further divided
into four specific types of meta-evaluation designs. These pertain to the
assessment of evaluation goals, of evaluation designs, of evaluation processes,
and of evaluation results. Each type of meta-evaluation design is further defined
by the delineating, obtaining, and providing tasks. Thus, Figure 4 identifies
four types of proactive and four types of retroactive meta-evaluation designs.
Within the figure each design type is defined by the steps in the evaluation process.

It is to be noted that the proactive meta-evaluation designs all result in 
recommendations, while the retroactive meta-evaluation designs all result in
judgments. Proactive meta-evaluation studies assist in choosing evaluation 
goals, choosing evaluation designs, carrying out chosen evaluation designs, 
and attaining desirable evaluation results and impacts. Retroactive evaluation 
results provide assessments of the merits of completed evaluation activities. 

In practice the four types of pro-active meta-evaluation studies are
usually conducted separately, as they relate to specific decision points 
in the evaluation process. However, the retroactive meta-evaluations are 
often combined into a single summative case study since they pertain to completed 
and interrelated sets of evaluation activities. 
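
The interaction of the three dimensions can be made concrete by enumerating them. What follows is a minimal sketch in Python, not part of the original paper; the names `PURPOSES`, `OBJECTS`, `STEPS`, and `design_types` are illustrative assumptions. It pairs each purpose with each object to give the eight design types, attaches the three steps to each, and records whether the design yields recommendations or judgments.

```python
from itertools import product

# Purposes mapped to the kind of product each yields, as stated in the text.
PURPOSES = {
    "proactive (formative)": "recommendations",
    "retroactive (summative)": "judgments",
}
OBJECTS = ["goals", "designs", "processes", "results"]
STEPS = ["delineating", "obtaining", "applying"]


def design_types():
    """Return one record per design type: a purpose paired with an object."""
    return [
        {"purpose": purpose, "object": obj, "product": output, "steps": list(STEPS)}
        for (purpose, output), obj in product(PURPOSES.items(), OBJECTS)
    ]


if __name__ == "__main__":
    for d in design_types():
        print(f"{d['purpose']:<24} evaluation {d['object']:<10} -> {d['product']}")
```

Counting one cell per step gives 2 x 4 x 3 = 24 cells in the structure, grouped into four proactive and four retroactive design types.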

This completes Part II of this paper. In it an overall conceptualization 
of meta-evaluation was presented. Specifically a definition, seven premises, 
and a general structure for meta-evaluation were provided. These were 
suggested as general guidelines that evaluators might use to assess their work. 
The object of the conceptualization is to assist evaluators to identify and
ameliorate the problems identified in Part I and to serve the meta-evaluation 
criteria that appeared in the same part of this paper. The third and final 
part of this paper considers how the conceptualization presented in Part II 
can be applied in practice. 




III. Use of the Conceptualization of Meta-Evaluation 

This third and final part of the paper is intended to provide practical
guidelines and examples for conducting meta-evaluations. Specifically, the
structure introduced in Part II has been used to generate and describe five
meta-evaluation designs. Examples of real-world activities that match the
designs are also presented.

Figure 5 summarizes the designs to be discussed within the logical
structure for meta-evaluation that was presented in Part II. There are
four pro-active designs (1-4) that assist evaluators, respectively, to determine
evaluation goals, choose evaluation designs, carry them out, and use them to
produce valuable results and impacts. The final design (5) provides a
summative assessment of the overall worth of a completed evaluation effort.

Design #1--for Pro-active Assessment of Evaluation Goals

Design #1 pertains to pro-active meta-evaluation studies that identify
and rank alternative evaluation goals.

In delineating such studies it is necessary to identify the audiences for
the primary evaluation, to identify a range of possible evaluation goals, 
and to identify criteria for rating the goals. The audiences are those 
persons who are to be affected by the evaluation study that is the subject 
of the meta-evaluation. The alternative goals are the alternative reasons that 
members of the audience and the evaluation team have for conducting the study. 
Such reasons may be for decision making and/or accountability, and they may 
refer to specific questions about program goals, designs, processes, and 
results. The criteria for assessing evaluation goals include such variables
as scope, importance, tractability, and clarity. Overall the delineating
activities for Design #1 should clarify audiences for the primary evaluation,
alternative evaluation goals, and criteria for rating the evaluation goals.

[Figure 5: summary of the five meta-evaluation designs within the logical
structure presented in Part II. The figure itself is not recoverable from the
scanned source.]

Steps for obtaining the information required by Design #1 include
logical analysis and ratings of the alternative evaluation goals. The
logical analysis can be done by the primary evaluators or by specially
commissioned meta-evaluators. Their analyses should define each goal in terms
of at least the following questions:

1. Who is to be served by the goal?

2. What question will be answered?

3. Why does the audience want to know that?

4. What action will likely be guided through achieving this evaluation
goal?

One way of analyzing the alternative goals is through a matrix with labels for
alternative evaluation goals as its row headings and the above questions as
its column headings.

Once the alternative evaluation goals have been analyzed it is necessary
to rank them. This is a matter of getting representatives of the primary
evaluation team and of their audiences to rate the goals on each selected
criterion (e.g., for clarity, scope, importance, and tractability). A common
way of doing this is through use of the Delphi technique.

After the alternative evaluation goals have been identified and rated,
recommendations should be formulated concerning what evaluation goals should be
adopted. Ultimately, the primary evaluators and their clients must choose the
objectives that will serve as the basis for their evaluation study.
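
The rating and ranking step can be sketched in a few lines of Python. This is an illustration only, not a procedure taken from the paper: the goal names, the 1-to-5 scale, the scores, and the function `rank_goals` are hypothetical, and a plain average across raters and criteria stands in for the Delphi technique, which would repeat the rating over several rounds with feedback to the raters.

```python
from statistics import mean

# Criteria named in the text for rating evaluation goals.
CRITERIA = ["clarity", "scope", "importance", "tractability"]

# Hypothetical goals and ratings (1 = low, 5 = high), one score per rater.
RATINGS = {
    "guide program improvement": {
        "clarity": [4, 5], "scope": [3, 3], "importance": [5, 5], "tractability": [4, 4],
    },
    "report to funders": {
        "clarity": [3, 3], "scope": [4, 4], "importance": [3, 3], "tractability": [2, 2],
    },
}


def rank_goals(ratings):
    """Return (goal, overall mean rating) pairs, highest rated first."""
    overall = {}
    for goal, by_criterion in ratings.items():
        # Average across raters within each criterion, then across criteria.
        overall[goal] = mean(mean(by_criterion[c]) for c in CRITERIA)
    return sorted(overall.items(), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for goal, score in rank_goals(RATINGS):
        print(f"{score:.2f}  {goal}")
```

Equal weighting of the criteria is itself an assumption; a rating group could just as well agree on weights before the ratings are combined.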




A study that was conducted for the Bureau of the Handicapped
in the U.S. Office of Education illustrates the use of Design #1.
This study was directed by Dr. Robert Hammond. The charge was to identify and
rate alternative goals for evaluating programs for the educationally
handicapped.

Hammond commissioned experts in evaluation and in education for the
handicapped to write two position papers: one concerning what alternative
evaluation goals should be considered, the other suggesting criteria for use in
rating the evaluation goals.

These papers were used as the basis for a national conference to identify
and rate goals for national and state efforts to evaluate programs for the
handicapped. About forty people were invited to attend this working conference.
These persons were selected to be representative of work in the different areas
of handicapped; of local, state, and national levels of education; of educational
evaluation; and of different areas of the country.

The conference lasted five days. The first day was devoted to reviewing 
and discussing the working papers and especially to choosing criteria for rating 
evaluation goals. The second, third, and fourth days were used in conducting
three rounds of a Delphi study. Its purpose was to have the group expand the
alternative evaluation goals, rate them on the selected criteria, and achieve
a consensus concerning what evaluation goals should be recommended to the
Bureau of the Handicapped. The final day was devoted to preparing the final 
report for the U.S. Office of Education. 

To the credit of Dr. Hammond and of round-the-clock shifts of clerical personnel,
the final report was distributed in final form during the last day of the
conference. This fact, plus the fact that the report reflected thoughtful working
papers on evaluation goals and criteria and three rounds of a Delphi study,
is evidence that Design #1 can be employed to serve decision making in
evaluation.

Design #2--for Pro-active Assessment of Evaluation Designs

Design #2 pertains to pro-active meta-evaluation efforts that identify and 
rank alternative evaluation designs. 

In delineating such studies one identifies alternative evaluation designs 
and criteria for rating the designs. Identifying evaluation designs starts 
with a survey of existing designs in the literature. If such a survey fails 
to turn up appropriate designs, it is necessary to invent new ones. Formulation
of the designs includes matters of sampling, instrumentation, treatments, and 
data analysis. Standard criteria for rating evaluation designs include technical 
adequacy (internal validity, external validity, reliability, and objectivity), 
utility (relevance, importance, scope, credibility, pervasiveness, and timeliness), 
and the prudential criterion of efficiency. 

After the alternative evaluation designs and the criteria for rating them
have been determined, it is necessary to apply the criteria to the designs.
Campbell and Stanley's standardized ratings of experimental designs [59] are
useful in this area. Buros' Mental Measurements Yearbooks are also
useful for identifying and assessing published tests that might be a part of
the designs. Finally the alternative evaluation designs under consideration
need to be ranked for their overall merits.
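
Because the standard criteria for designs are grouped (technical adequacy, utility, efficiency), one way to apply them is to summarize ratings within each group before ranking. The sketch below assumes a 1-to-5 rating on every sub-criterion and equal weight for the three groups; the design names, the scores, and the helpers `uniform`, `summarize`, and `rank_designs` are hypothetical and are not drawn from the paper.

```python
from statistics import mean

# Criteria and sub-criteria as listed in the text.
CRITERIA = {
    "technical adequacy": ["internal validity", "external validity",
                           "reliability", "objectivity"],
    "utility": ["relevance", "importance", "scope", "credibility",
                "pervasiveness", "timeliness"],
    "efficiency": ["efficiency"],
}


def uniform(score, **overrides):
    """Hypothetical helper: rate every sub-criterion `score`, except overrides."""
    ratings = {sub: score for subs in CRITERIA.values() for sub in subs}
    ratings.update(overrides)
    return ratings


def summarize(ratings):
    """Average the sub-criterion ratings within each top-level criterion."""
    return {top: mean(ratings[sub] for sub in subs) for top, subs in CRITERIA.items()}


def rank_designs(all_ratings):
    """Return design names ordered best first, plus the per-criterion summaries."""
    summaries = {name: summarize(r) for name, r in all_ratings.items()}
    order = sorted(summaries, key=lambda name: mean(summaries[name].values()),
                   reverse=True)
    return order, summaries


# Hypothetical ratings for two competing designs.
EXAMPLE = {"design A": uniform(4), "design B": uniform(3, efficiency=5)}

if __name__ == "__main__":
    order, summaries = rank_designs(EXAMPLE)
    for name in order:
        print(name, summaries[name])
```

Keeping the per-criterion summaries alongside the ranking supports the documentation called for in the text, since a recommendation can then be justified criterion by criterion.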

The description and judgment of alternative evaluation designs leads to a
recommendation concerning what evaluation design should be chosen. This
recommendation should be based on documentation of the meta-evaluation
study. The documentation should include a reference to the selected
evaluation goals, a description of the alternative designs that were
considered, a listing of the criteria that were used to compare the designs,
and a summary of the ratings of the designs. Finally, the recommended
design should be justified in view of the available evidence.

An instance of Design #2 occurred when the National Institute of
Education sought to adopt a design for evaluating regional laboratories
and research and development centers. To achieve this purpose, NIE contracted
with the Ohio State University Evaluation Center for the development and
assessment of alternative evaluation designs. [61, 62]

The Center engaged two teams of evaluation specialists to generate
alternative evaluation designs. These specialists were presented with an
NIE policy statement [63] concerning what decisions should be served by the
evaluation. The teams were oriented to the nature of activities in labs 
and centers. The teams were given criteria that they should meet in the 
development of their evaluation designs. 

The teams then generated competing evaluation systems. [64] Their reports
were sent to lab and center personnel who rated the two designs. A panel
of four experts was also engaged to evaluate the two designs. A hearing
was held in Washington to obtain further input concerning the designs.

Finally, the NIE staff reviewed the available information and chose 
one of the designs. Overall, the implementation of this meta-evaluation was
conducted during two months and under a budget of $21,000. 



Design #3--for Pro-active Assessment of the Implementation of a Chosen Evaluation Design

Design #3 pertains to pro-active meta-evaluation studies to guide the 
implementation of a given evaluation design. 

The delineating tasks in relation to Design #3 are extensive. Based on 
the results of a type #2 meta-evaluation, an evaluation design has been chosen.
There are many administrative and technical decisions to be made in operationalizing
the chosen design. The operational characteristics of the chosen evaluation design
need to be explicated, and potential problems in the implementation of the
design need to be projected. These characteristics and potential problems 
serve as foci for periodic checks on how well the chosen evaluation design is 
being implemented. 

A number of techniques are available for delineating the operational
characteristics of evaluation designs. These techniques include Work
Breakdown Structure, Critical Path Analysis, and Program Evaluation and
Review Technique. [66] An additional technique called an Administrative
Checklist for Reviewing Evaluation Designs is being introduced here. The
checklist appears as Figure 6. It reflects the problems that were described
in Part I of this paper and is suggested for use in reviewing evaluative
activity. These techniques are intended for use in delineating the operational
characteristics, decision points, and potential problems that relate to the
implementation of a given evaluation design.

Figure 6

An Administrative Checklist
for Reviewing Evaluation Plans

Conceptualization of Evaluation

__ Definition -- How is evaluation defined in this effort?
__ Purpose -- What purpose(s) will it serve?
__ Questions -- What questions will it address?
__ Audiences -- Who will it serve?
__ Agents -- Who will do it?
__ Process -- How will they do it?
__ Standards -- By what standards will their work be judged?

Sociopolitical Factors

__ Involvement -- Whose sanction and support is required, and how will it be secured?
__ Internal communication -- How will communication be maintained between the evaluators, the sponsors, and the system personnel?
__ Internal credibility -- Will the evaluation be fair to persons inside the system?
__ External credibility -- Will the evaluation be free of bias?
__ Security -- What provisions will be made to maintain security of the evaluative data?
__ Protocol -- What communication channels will be used by the evaluators and system personnel?
__ Public relations -- How will the public be kept informed about the intents and results of the evaluation?

Contractual/Legal Arrangements

__ Client/evaluator relationship -- Who is the sponsor, who is the evaluator, and how are they related to the program to be evaluated?
__ Evaluation products -- What evaluation outcomes are to be achieved?
__ Delivery schedule -- What is the schedule of evaluation services and products?
__ Editing -- Who has authority for editing evaluation reports?
__ Access to data -- What existing data may the evaluator use, and what new data may he obtain?
__ Release of reports -- Who will release the reports and what audiences may receive them?
__ Responsibility and authority -- Have the system personnel and evaluators agreed on who is to do what in the evaluation?
__ Finances -- What is the schedule of payments for the evaluation, and who will provide the funds?

The Technical Design

__ Objectives and variables -- What is the program designed to achieve, in what terms should it be evaluated?
__ Investigatory framework -- Under what conditions will the data be gathered, e.g., experimental design, case study, survey, site review, etc.?
__ Instrumentation -- What data gathering instruments and techniques will be used?
__ Sampling -- What samples will be drawn, how will they be drawn?
__ Data gathering -- How will the data gathering plan be implemented, who will gather the data?
__ Data storage and retrieval -- What format, procedures and facilities will be used to store and retrieve the data?
__ Data analysis -- How will the data be analyzed?
__ Reporting -- What reports and techniques will be used to disseminate the evaluation findings?
__ Technical adequacy -- Will the evaluative data be reliable, valid, and objective?

The Management Plan

__ Organizational mechanism -- What organizational unit will be employed, e.g., an in-house office of evaluation, a self evaluation system, a contract with an external agency, or a consortium-supported evaluation center?
__ Organizational location -- Through what channels can the evaluation influence policy formulation and administrative decision making?
__ Policies and procedures -- What established and/or ad hoc policies and procedures will govern this evaluation?
__ Staff -- How will the evaluation be staffed?
__ Facilities -- What space, equipment, and materials will be available to support the evaluation?
__ Data gathering schedule -- What instruments will be administered, to what groups, according to what schedule?
__ Reporting schedule -- What reports will be provided, to what audiences, according to what schedule?
__ Training -- What evaluation training will be provided to what groups and who will provide it?
__ Installation of evaluation -- Will this evaluation be used to aid the system to improve and extend its internal evaluation capability?
__ Budget -- What is the internal structure of the budget, how will it be monitored?

Moral/Ethical/Utility Questions

__ Philosophical stance -- Will the evaluation be value free, value based, or value plural?
__ Service orientation -- What social good, if any, will be served by this evaluation, whose values will be served?
__ Evaluator's values -- Will the evaluator's technical standards and his values conflict with the client system's and/or sponsor's values; will the evaluator face any conflict of interest problems; and what will be done about possible conflicts?
__ Judgments -- Will the evaluator judge the program; leave that up to the client; or obtain, analyze and report the judgments of various reference groups?
__ Objectivity -- How will the evaluator avoid being co-opted and maintain his objectivity?
__ Prospects for utility -- Will the evaluation meet utility criteria of relevance, scope, importance, credibility, timeliness and pervasiveness?
__ Cost/effectiveness -- Compared to its potential payoff will the evaluation be carried out at a reasonable cost?

The actual data gathering and analysis involved in implementing
Meta-Evaluation Design #3 involve periodic reviews of the evaluation design and
monitoring of the evaluation process. These review and monitoring activities
are intended to determine whether the design has been adequately operationalized
and how well the design is being carried out. Such data gathering activities
can be implemented by evaluation administrators through requiring the evaluators
to make periodic oral and/or written progress reports. Another means of
gathering this information is through employing external auditors to make
periodic checks on the implementation of the evaluation design.
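
For periodic reviews of this kind, the checklist of Figure 6 lends itself to a simple record structure. The following Python sketch is an assumption about how a reviewer might track which checklist questions remain unanswered; the class `ChecklistItem`, the function `open_items`, and the sample answers are hypothetical, and only four of the checklist entries are shown.

```python
from dataclasses import dataclass


@dataclass
class ChecklistItem:
    section: str
    label: str
    question: str
    answer: str = ""  # left empty until a reviewer records a response


def open_items(items):
    """Group the labels of unanswered items by checklist section."""
    pending = {}
    for item in items:
        if not item.answer.strip():
            pending.setdefault(item.section, []).append(item.label)
    return pending


# A few entries from Figure 6; the answers are hypothetical.
ITEMS = [
    ChecklistItem("Conceptualization of Evaluation", "Audiences",
                  "Who will it serve?", "District staff"),
    ChecklistItem("Contractual/Legal Arrangements", "Editing",
                  "Who has authority for editing evaluation reports?"),
    ChecklistItem("The Technical Design", "Sampling",
                  "What samples will be drawn, how will they be drawn?"),
    ChecklistItem("The Technical Design", "Data analysis",
                  "How will the data be analyzed?", "Descriptive statistics"),
]

if __name__ == "__main__":
    for section, labels in open_items(ITEMS).items():
        print(f"{section}: {', '.join(labels)}")
```

Rerunning such a check at each progress report gives a running list of the design decisions that have not yet been operationalized.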

Feedback from meta-evaluation Design #3 includes two basic kinds of
information. The first is a logging of the actual process of evaluation. This
will be useful at the end of the evaluation project for interpretation of
evaluation results. Another kind of feedback pertains to the identification
of problems and recommendations for improving the evaluation activities. This
type of feedback is important for the manager of the evaluation process.

In practice there are many instances of meta-evaluations that check on and
guide the implementation of evaluation designs. Largely these pertain to 
self-assessment activities and sometimes to the employment of external 
consultants. 

Design #4--for Proactive Assessment of the Quality and Use of Evaluation Results

Design #4 provides for proactive meta-evaluation studies that enhance 
the quality and use of evaluation results. 

In delineating the information requirements associated with this design
type, three things must be done: the evaluation objectives should be noted;
the meta-evaluation criteria of technical adequacy, utility, and
cost/effectiveness should be spelled out in relation to the evaluation objectives;
and the intended users of the primary evaluation results should be designated.
Delineation of these matters provides a basis for obtaining the information
needed periodically to assess the quality and impact of the evaluation
information that is being gathered in the primary evaluation activity.

A number of things can be done to obtain information about the quality
and impact of primary evaluation reports. Evaluation reports can be gathered
and the information they convey can be rated for its validity, reliability, and
objectivity. Records can be kept of primary evaluation expenditures. Records
can also be kept of instances of use of the evaluation reports by the intended
audiences. Also these audiences can be asked to rate the utility of the
evaluation reports that they receive. Such information on the effectiveness of
evaluation can be obtained by the evaluation manager or an external auditor.

It is to be noted that such meta-evaluation of the effectiveness of 
an evaluation might appropriately be conducted in conjunction with an 
effort to gather meta-evaluation data on the implementation of an evaluation 
design. While both meta-evaluation Designs #3 and #4 are implemented during 
the same time frame, feedback concerning the adequacy of implementation of 
an evaluation activity is relatively more important during the early stages 
of an evaluation project. Conversely, later in meta-evaluation projects 
feedback concerning the effectiveness of an evaluation is more important 
than is feedback about implementation of the primary evaluation design. This
relationship between meta-evaluation Designs #3 and #4 is portrayed in Figure 7.

Feedback from meta-evaluation Design #4 includes periodic reports of 
the quality, impact, and cost/effectiveness of the evaluation work. The 
burden of these reports is periodically to rate the success of the evaluation 
results and to provide recommendations for Improving the evaluation effort. 

Design #5--for Retroactive Assessment of Evaluation Goals

With Design #5 we move to the area of retroactive meta-evaluation. In 
practice, the retroactive meta-evaluation of goals, designs, implementation, 
and results usually are combined into a single summative case study.

Figure 7

The Relative Importance of Meta-Evaluation Designs #3 and #4
during the Process of a Primary Evaluation Study

[The graph itself is not recoverable from the scanned source.]

The first main step in implementing Design #5 is to determine the intents 
of the evaluator who conducted the study. What audience did he intend to serve? 
What evaluation goals guided his study? What evaluation design was chosen to 
achieve these goals? How did the evaluator intend to carry out his design?
What specific impacts did he think would be achieved? When questions such as 
these have been delineated, the evaluator should next gather relevant data for 
judging evaluation goals, designs, implementation, and results. 

Information should be sought to answer a number of questions about the 
evaluation goals. What did the intended audience say they wanted from the 
evaluation? Were there other legitimate audiences? What information would they 
have wanted? Were alternative evaluation goals considered by the evaluator?
What were they? Why were they rejected? Overall, how defensible were the 
evaluation goals? Data in response to these questions are needed 
for judging evaluation goals. 

The evaluator should also compile information about the evaluation design 
that was chosen. How does it rate on criteria of technical adequacy, 
utility, efficiency, and feasibility? Were other designs considered? On 
balance was the chosen design better than others that might have been chosen? 

Given the design that was chosen, the evaluator should next determine 
how well it was implemented. Was it carried out fully? If not, what 
difficulties accounted for the faulty implementation? Did the evaluator 
effectively counter the conceptual, sociopolitical, legal, technical, 
management, and ethical problems that were described in Part I? What did the
implementation cost? Overall, how well was the evaluation design implemented, 
and what specific problems were encountered? 

As a final issue the evaluator should consider what results were
produced. Were the objectives achieved? What information was produced? How
good was it? Was it used? By whom? Did they use it appropriately? Overall,
what impact was made by the evaluation, and how desirable and cost/effective
was it?

The information obtained in response to the above questions should be
combined into an overall report. This report should be written for and dissem- 
inated to the audiences for the primary evaluation that has been scrutinized. 
Through this practice, results of the primary evaluation can be viewed and 
used in regard to both their strengths and weaknesses. 

An example of Design #5 occurred when a research and development agency
engaged a team of three meta-evaluators to assess the agency's evaluation system. 
The agency presented a conceptual framework to describe their evaluation system 
and charged the meta-evaluators to assess the agency's evaluation performance 
against the framework. 

The framework appears in Figure 8. The horizontal dimension indicates 
that the agency's evaluation system should address questions about the system's 
goals, plans, processes, and achievements. The left-most vertical dimension
references the levels of the parent agency, i.e., system, program, and project.
The third dimension indicates that the evaluation system should be judged
concerning whether audiences and evaluative questions have been delineated (the
matter of evaluation goals); whether data collecting and reporting devices 
and procedures have been determined for answering the questions (evaluation 
design); whether evaluation data are actually being gathered and analyzed 
(implementation); and whether results are sound and used appropriately by the 
audiences. 

Figure 8

A Framework for Describing and Judging an Evaluation System

                                                       Program Attributes
Organizational   Evaluation
Levels           Attributes                   Goals   Designs   Implementation   Results

Level 1          Delineation of questions
(e.g. System)      and audiences                 1        2            3             4
                 Operationalization of
                   evaluation procedures
                   and devices                   5        6            7             8
                 Implementation of
                   evaluation procedures         9       10           11            12
                 Use of evaluation data         13       14           15            16

Level 2          Delineation                    17       18           19            20
(e.g. Program)   Operationalization             21       22           23            24
                 Implementation                 25       26           27            28
                 Use                            29       30           31            32

Level 3          Delineation                    33       34           35            36
(e.g. Project)   Operationalization             37       38           39            40
                 Implementation                 41       42           43            44
                 Use                            45       46           47            48

The combination of the three dimensions of Figure 8 provides 48 cells that
specifically focused the meta-evaluation work. Agency personnel were asked
to generate documentation for what had been done regarding each of the 48 cells.
For example, for cell 15 they were asked to produce data that their evaluators
had obtained concerning the impacts of the agency on its target population
and to describe how the impact data had been used. For cell 18 they were asked
to describe how alternative program designs and criteria for judging them are
identified in the agency. In cell 22 they were asked to produce procedures
and instruments that their evaluators had used to judge program plans. In
cell 26 they were asked to produce data regarding alternative program designs;
and in cell 30 they were asked to produce evidence concerning the quality
and use of their data about alternative program designs. What the
meta-evaluators wanted, then, was information about evaluation goals, designs,
implementation, and results; and they wanted it for each of the three
levels in the agency and for the program variables of goals, designs,
implementation, and results.
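
The way the three dimensions combine into 48 numbered cells can be sketched in a few lines of Python. This is only an illustration, not part of the original study: the names LEVELS, EVALUATION_ATTRIBUTES, PROGRAM_ATTRIBUTES, and build_cells are hypothetical, and the numbering rule (cells counted across the program attributes within each evaluation attribute, level by level) is an assumption inferred from the examples given above for cells 18, 22, 26, and 30.

```python
from itertools import product

# Labels taken from Figure 8; the variable names themselves are hypothetical.
LEVELS = ("System", "Program", "Project")
EVALUATION_ATTRIBUTES = ("Delineation", "Operationalization", "Implementation", "Use")
PROGRAM_ATTRIBUTES = ("Goals", "Designs", "Implementation", "Results")


def build_cells():
    """Map each cell number (1-48) to (level, evaluation attribute, program attribute).

    Assumed numbering: program attributes vary fastest, then evaluation
    attributes, then organizational levels.
    """
    combos = product(LEVELS, EVALUATION_ATTRIBUTES, PROGRAM_ATTRIBUTES)
    return {number: combo for number, combo in enumerate(combos, start=1)}


CELLS = build_cells()

# Look up the cells that the text uses as examples.
for number in (18, 22, 26, 30):
    print(number, CELLS[number])
```

Under this assumed numbering, cells 18, 22, 26, and 30 all fall at the program level and in the Designs column, one for each of the four evaluation attributes, which agrees with the requests described for those cells.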

The agency personnel responded by preparing notebooks of information 
that were organized according to the dimensions of their evaluation framework. 
Included were three major parts on system evaluation, program evaluation, and 
project evaluation. Within each part were sections on the evaluation of goals,
designs, process and results. Each of these sections contained the information 
needed by the meta-evaluators concerning the evaluation's goals, designs, 
implementation, and results. Thus the notebook of information provided the
initial basis for evaluating the agency's evaluation system. 

The meta-evaluators prepared nine-item rating scales for each of the
48 cells. These scales were to be used for rating the quality of the agency's 
evaluation work in each of the 48 cells. 


The meta-evaluators then visited the agency for three days. During that 
time they read pertinent documents, studied the contents of the specially 
prepared notebooks, and interviewed personnel at each of the three levels of the
agency. Then the meta-evaluators independently completed the 48 rating scales. 
The three observations per cell provided a basis for determining inter-judge 
reliability and for analyzing the strengths and weaknesses of the evaluation 
system. Subsequently the meta-evaluators developed a table that essentially 
was the agency's logical evaluation structure with mean ratings of quality 
in each of the 48 cells. This table was used as a basis for an initial report 
and exit interview with agency personnel. 
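
The computation described in this paragraph (three independent ratings per cell, reduced to a mean per cell and checked for agreement among judges) might look roughly like the following Python sketch. Several things here are assumptions and not taken from the study: the paper does not say which reliability statistic was used, so the average pairwise Pearson correlation is only one plausible choice; each judge is assumed to give a single score from 1 to 9 per cell; and the scores themselves are invented placeholders generated by a fixed arithmetic pattern.

```python
from itertools import combinations
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of ratings."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / (spread_x * spread_y) ** 0.5


def cell_means(ratings):
    """Mean rating per cell, given {judge: [one score per cell]}."""
    return [mean(scores) for scores in zip(*ratings.values())]


def mean_pairwise_correlation(ratings):
    """Average Pearson correlation over every pair of judges (assumed statistic)."""
    pairs = combinations(ratings.values(), 2)
    return mean([pearson(a, b) for a, b in pairs])


# Hypothetical data: an invented underlying score per cell, with each of the
# three judges deviating from it by -1, 0, or +1 and clamped to the 1-9 range.
N_CELLS = 48
underlying = [(7 * i) % 9 + 1 for i in range(N_CELLS)]
ratings = {
    f"judge_{k}": [min(9, max(1, q + (i + k) % 3 - 1)) for i, q in enumerate(underlying)]
    for k in range(3)
}

table = cell_means(ratings)                      # one mean rating per cell
agreement = mean_pairwise_correlation(ratings)   # rough inter-judge agreement
print(len(table), round(agreement, 2))
```

The list of cell means corresponds to the table of mean ratings described above; a low agreement value would suggest that the means rest on inconsistent judgments and should be interpreted cautiously.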

Three main findings were presented during that session: (1) the
agency was generally strong in identifying and assessing alternative plans,
(2) the agency was somewhat weak at all organizational levels in assessing
results, and (3) the program level evaluation was almost non-existent. The
agency personnel were interested in the judgments of the system and
project-level evaluation but were startled and concerned about the poor showing of
their program-level evaluation. They asked the meta-evaluators to provide
recommendations in the written report concerning what could be done to change
this situation.
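
Findings of this kind amount to reading the margins of the 48-cell table: averaging the cell ratings by organizational level and by program attribute shows where the system is weak or strong. The Python sketch below illustrates that reading under stated assumptions. The function marginal_means and the helper hypothetical_rating are illustrative names, and the ratings are invented solely to mimic the pattern of the three findings; they are not the agency's actual ratings.

```python
from collections import defaultdict
from itertools import product
from statistics import mean

LEVELS = ("System", "Program", "Project")
EVALUATION_ATTRIBUTES = ("Delineation", "Operationalization", "Implementation", "Use")
PROGRAM_ATTRIBUTES = ("Goals", "Designs", "Implementation", "Results")


def marginal_means(table, axis):
    """Average the ratings of all cells that share one value on `axis`.

    axis 0 = organizational level, 1 = evaluation attribute, 2 = program attribute.
    """
    groups = defaultdict(list)
    for key, rating in table.items():
        groups[key[axis]].append(rating)
    return {name: mean(values) for name, values in groups.items()}


def hypothetical_rating(level, program_attribute):
    """Invented mean ratings on an assumed 1-9 scale, for illustration only."""
    if level == "Program":
        return 2.0   # program-level evaluation almost non-existent
    if program_attribute == "Results":
        return 4.0   # somewhat weak in assessing results
    if program_attribute == "Designs":
        return 8.0   # strong in assessing alternative plans
    return 6.0


table = {
    (level, evaluation, program): hypothetical_rating(level, program)
    for level, evaluation, program in product(
        LEVELS, EVALUATION_ATTRIBUTES, PROGRAM_ATTRIBUTES
    )
}

by_level = marginal_means(table, 0)
by_program_attribute = marginal_means(table, 2)
weakest_level = min(by_level, key=by_level.get)
weakest_attribute = min(by_program_attribute, key=by_program_attribute.get)
strongest_attribute = max(by_program_attribute, key=by_program_attribute.get)
print(weakest_level, weakest_attribute, strongest_attribute)
```

With real ratings the same marginal averages would be computed from the judges' mean scores; the invented values here only show how a pattern such as a weak program level would surface in the summary.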

Following the visitation the meta-evaluators wrote and submitted
their final report. It was focused on the 48 cell table of judgments
that had been prepared on site. However, it was broader than that.

Generally it addressed ten questions about the agency's evaluation system:

1. whether it addresses worth and merit questions regarding goals,
designs, implementation, and main and side effects;

2. whether it does so both pro-actively and retroactively in relation
to decisions about the four question types;




3. whether the four question types and the key audiences for the evaluation
are explicated at each organizational level;

4. whether explicit, sound procedures and instruments have been (or will
be) determined for answering the specified questions at each level (The
concern here is with the criteria of technical adequacy [reliability, internal
validity, external validity, and objectivity]; utility [timeliness, relevance,
importance, pervasiveness, credibility, and scope]; and the prudential criterion
of the efficiency of the evaluation.);

5. whether data required to answer the specified questions are being
obtained at each organizational level;

6. whether data concerning the specified questions systematically are
being organized and stored in retrievable form to meet accountability needs
at each organizational level;

7. whether the evaluation system is having an impact on the decisions
related to the four question types at each level;

8. whether the evaluation system has the capacity to identify and
respond to emergent evaluative needs at each level;

9. whether the evaluation system is being implemented so as to enhance
prospects for systematic evaluation beyond those short-term efforts supported
through externally funded projects; and

10. whether a strong case can be made that the cost of the evaluation system 
is appropriate for satisfying the criteria enumerated above. 

The report that was submitted in regard to the above questions pinpointed
program-level evaluation as the weakest part of the agency's evaluation work.
The report further speculated that the programs, themselves, were not taken
seriously in the agency, and that in fact the agency was only a holding company for
miscellaneous projects. An unexpected effect of the report was that it led
to a reorganization of the total agency in order to strengthen both programs
and the mechanism for evaluating them. This illustrates that meta-evaluation
can play a strong role in effecting change.

This concludes Part III and the discussion of meta-evaluation designs. 
An attempt has been made to present general designs that cover the different 
meta-evaluation assignments. Actual cases that relate to the designs have
been described to demonstrate that meta-evaluations are real and not just
theoretic. The designs are cryptic, and the examples few; it is hoped that
others will extend and improve on these designs and examples.




Summary 

The purpose of this paper has been to explore the topic of
meta-evaluation. Part I discussed the need to develop a technology for
evaluating evaluation; described eleven meta-evaluation criteria; and delineated
six classes of problems that plague evaluation efforts. Part II presented
a definition, seven premises, and a logical structure for meta-evaluation
work. Part III described how the structure might be used through describing
and illustrating five meta-evaluation designs.

It is hoped that this paper will stimulate further actions. Hopefully 
some of the ideas and devices will be of use to persons who evaluate 
evaluations. It is hoped that other persons might be stimulated by this paper to 
further delineate and operationalize meta-evaluation concepts. 

Given the poor quality of evaluation performance in education,
and the lack of a research base to guide evaluators, it seems urgent that
ways of defining and assuring the quality of evaluation work be contrived.
This paper has been one attempt to move the field of evaluation toward
a technology for evaluating evaluation.









